Abstract



Consider the following problem: given a metric space, some of whose points are "clients," select a set 
of at most k facility locations to minimize the average distance from the clients to their nearest facility. 
This is just the well-studied k-median problem, for which many approximation algorithms and hardness
results are known. Note that the objective function encourages opening facilities in areas where there are 
many clients, and given a solution, it is often possible to get a good idea of where the clients are located. 
This raises the following quandary: what if the locations of the clients are sensitive information that we 
would like to keep private? Is it even possible to design good algorithms for this problem that preserve 
the privacy of the clients? 

In this paper, we initiate a systematic study of algorithms for discrete optimization problems in the 
framework of differential privacy (which formalizes the idea of protecting the privacy of individual input 
elements). We show that many such problems indeed have good approximation algorithms that preserve 
differential privacy; this is even in cases where it is impossible to preserve cryptographic definitions of 
privacy while computing any non-trivial approximation to even the value of an optimal solution, let alone 
the entire solution. 

Apart from the k-median problem, we consider the problems of vertex and set cover, min-cut, facility
location, and Steiner tree, and give approximation algorithms and lower bounds for these problems.
We also consider the recently introduced submodular maximization problem, "Combinatorial Public
Projects" (CPP), shown by Papadimitriou et al. [PSS08] to be inapproximable to subpolynomial
multiplicative factors by any efficient and truthful algorithm. We give a differentially private (and hence
approximately truthful) algorithm that achieves a logarithmic additive approximation.



1 Introduction 



Consider the following problems: 

• Assign people using a social network to one of two servers so that most pairs of friends are assigned to 
the same server. 

• Open some number of HIV treatment centers so that the average commute time for patients is small. 

• Open a small number of drop-off centers for undercover agents so that each agent is able to visit some 
site convenient to her (each providing a list of acceptable sites). 

The above problems can be modeled as instances of well-known combinatorial optimization problems: 
respectively the minimum cut problem, the k-median problem, and the set cover problem. Good heuristics
have been designed for these problems, and hence they may be considered well-studied and solved. However, 
in the above scenarios and in many others, the input data (friendship relations, medical history, agents' 
locations) represent sensitive information about individuals. Data privacy is a crucial design goal, and it 
may be vastly preferable to use a private algorithm that gives somewhat suboptimal solutions to a non- 
private optimal algorithm. This leads us to the following central questions: Given that the most benign of 
actions possibly leaks sensitive information, how should we design algorithms for the above problems? What 
are the fundamental trade-offs between the utility of these algorithms and the privacy guarantees they give 
us? 

The notion of privacy we consider in this paper is that of differential privacy. Informally, differential 
privacy guarantees that the distribution of outcomes of the computation does not change significantly when 
one individual changes her input data. This is a very strong privacy guarantee: anything significant about 
any individual that an adversary could learn from the algorithm's output, he could also learn were the 
individual not participating in the database at all — and this holds true no matter what auxiliary information 
the adversary may have. This definition guarantees privacy of an individual's sensitive data, while allowing 
the computation to respond when a large number of individuals change their data, as any useful computation 
must do. 

1.1 Our Results 

In this paper we initiate a systematic study of designing algorithms for combinatorial optimization problems 
under the constraint of differential privacy. Here is a short summary of some of the main contributions of 
our work. 



• While the exponential mechanism of [MT07] is an easy way to obtain computationally inefficient private
approximation algorithms for some problems, the approximation guarantees given by a direct application
of this can be far from optimal (e.g., see our results on min-cut and weighted set cover). In these
cases, we have to use different techniques (often more sophisticated applications of the exponential
mechanism) to get good (albeit computationally expensive) solutions.

• However, we want our algorithms to be computationally efficient and private at the same time: here
we cannot use the exponential mechanism directly, and hence we develop new algorithmic ideas. We
give private algorithms for a wide variety of search problems, where we must not only approximate the
value of the solution, but also produce a solution that optimizes this value. See Table 1 for our results.

• For some problems, unfortunately, just outputting an explicit solution might leak private information.
For example, if we output a vertex cover of some graph explicitly, any pair of vertices not output
reveals that they do not share an edge, so any private explicit vertex cover algorithm must output
$n - 1$ vertices. To overcome this hurdle, we instead privately output an implicit representation of a
small vertex cover: we view vertex cover as a location problem, and output an orientation of the
edges. Each edge can cover itself using the end point that it points to. The orientation is output
privately, and the resulting vertex cover approximates the optimal vertex cover well. We deal with
similar representational issues for other problems like set cover as well.

• We also show lower bounds on the approximation guarantees regardless of computational considerations.
For example, for vertex cover, we show that any $\epsilon$-differentially private algorithm must have
an approximation guarantee of $\Omega(1/\epsilon)$. We show that each of our lower bounds is tight: we give
(computationally inefficient) algorithms with matching approximation guarantees.

| Problem | Non-private | Efficient Algorithms | Information Theoretic |
|---|---|---|---|
| Vertex Cover | $2 \times \mathrm{OPT}$ [Pit85] | $(2 + 16/\epsilon) \times \mathrm{OPT}$ | $\Theta(1/\epsilon) \times \mathrm{OPT}$ |
| Wtd. Vertex Cover | $2 \times \mathrm{OPT}$ [Hoc82] | $(16 + 16/\epsilon) \times \mathrm{OPT}$ | $\Theta(1/\epsilon) \times \mathrm{OPT}$ |
| Set Cover | $\ln n \times \mathrm{OPT}$ [Joh74] | $O(\ln n + \ln m/\epsilon) \times \mathrm{OPT}$ † | $\Theta(\ln m/\epsilon) \times \mathrm{OPT}$ |
| Wtd. Set Cover | $\ln n \times \mathrm{OPT}$ [Chv79] | $O(\ln n (\ln m + \ln\ln n)/\epsilon) \times \mathrm{OPT}$ † | $\Theta(\ln m/\epsilon) \times \mathrm{OPT}$ |
| Min Cut | $\mathrm{OPT}$ [FF56] | $\mathrm{OPT} + O(\ln n/\epsilon)$ † | $\mathrm{OPT} + \Theta(\ln n/\epsilon)$ |
| CPPP | $(1 - 1/e) \times \mathrm{OPT}$ [NWF78] | $(1 - 1/e) \times \mathrm{OPT} - O(k \ln m/\epsilon)$ † | $\mathrm{OPT} - \Theta(k \ln(m/k)/\epsilon)$ |
| k-Median | $(3 + \gamma) \times \mathrm{OPT}$ [AGK+04] | $6 \times \mathrm{OPT} + O(k^2 \ln^2 n/\epsilon)$ | $\mathrm{OPT} + \Theta(k \ln(n/k)/\epsilon)$ ^a |

Table 1: Summary of Results. Results in the second and third columns are from this paper.

^a [FFKN09] independently prove a similar lower bound.



• Our results have implications beyond privacy as well: Papadimitriou et al. [PSS08] introduce the
Combinatorial Public Project problem, a special case of submodular maximization, and show that the 
problem can be well approximated by either a truthful mechanism or an efficient algorithm, but not by 
both simultaneously. In contrast to this negative result, we show that under differential privacy (which 
can be interpreted as an approximate but robust alternative to truthfulness) we can achieve the same 
approximation factor as the best non-truthful algorithm, plus an additive logarithmic loss. 

• Finally, we develop a private amplification lemma: we show how to take private algorithms that give
bounds in expectation and efficiently convert them (privately) into bounds with high probability. This
answers an open question in the paper of Feldman et al. [FFKN09].



Table 1 summarizes the bounds we prove in this paper. For each problem, it reports (in the first
column) the best known non-private approximation guarantees, (in the second column) our best efficient
$\epsilon$-differentially private algorithms, and (in the third column) in each case matching upper and lower bounds
for inefficient $\epsilon$-differentially private algorithms. For a few of the efficient algorithms (marked with a †) the
guarantees are only for an approximate form of differential privacy, incorporating a failure probability $\delta$, and
scaling the effective value of $\epsilon$ up by $\ln(1/\delta)$.

1.2 Related Work 



Differential privacy is a relatively recent privacy definition (e.g., see [DMNS06, Dwo06, NRS07, BLR08,
KLN+08, FFKN09, DNR+09], and see [Dwo08] for an excellent survey), that tries to capture the intuition of
individual privacy. Many algorithms in this framework have focused on measurement, statistics, and learning
tasks applied to statistical data sets, rather than on processing and producing combinatorial objects. One
exception to this is the Exponential Mechanism of [MT07], which allows the selection from a set of discrete
alternatives.



Independently, Feldman et al. [FFKN09] also consider the problem of privately approximating k-medians
for points in $\mathbb{R}^d$. Their model differs slightly from ours, which makes the results largely incomparable: while
our results for general metrics translate to give smaller additive errors than theirs, we only output a
k-median approximation whereas they output coresets for the problem. Their lower bound argument for
private coresets is similar to ours.

Prior work on Secure Function Evaluation (SFE) tells us that in fact the minimum cut in a graph can be
computed in a distributed fashion in such a way that the computation reveals nothing that cannot be learnt from
the output of the computation. While this is a strong form of a privacy guarantee, it may be unsatisfying to
an individual whose private data can be inferred from the privately computed output. Indeed, it is not hard
to come up with instances where an attacker with some limited auxiliary information can infer the presence
or absence of specific edges from local information about the minimum cut in the graph. By relaxing the
whole input privacy requirement of SFE, differential privacy is able to provide unconditional per element
privacy, which SFE need not provide if the output itself discloses properties of input.

Feigenbaum et al. [FIM+06] extend the notion of SFE to NP-hard problems for which efficient algorithms
must output an approximation to the optimum, unless P=NP. They defined as functional privacy the
constraint that two inputs with the same output value (e.g. the size of an optimal vertex cover) must produce
the same value under the approximation algorithm. Under this constraint, Halevi et al. [HKKN01] show
that approximating the value of vertex cover to within $n^{1-\xi}$ is as hard as computing the value itself, for any
constant $\xi$. These hardness results were extended to search problems by Beimel et al. [BCNW06], where the
constraint is relaxed to only equate those inputs whose sets of optimal solutions are identical. These results
were extended and strengthened by Beimel et al. [BHN07, BMNW07].

Nonetheless, Feigenbaum et al. [FIM+06] and others show a number of positive approximation results
under versions of the functional privacy model. Halevi et al. [HKKN01] provide positive results in the
function privacy setting when the algorithm is permitted to leak few bits (each equivalence class of input
need not produce identical output, but must be one of at most $2^b$ possible outcomes, where $b$ is the number
of leaked bits). Indyk and Woodruff
also give some positive results for the approximation of $\ell_2$ distance and a nearest neighbor problem [IW06].
However, as functional privacy extends SFE, it does not protect sensitive data that can be inferred from the
output.

Nevertheless, SFE provides an implementation of any function in a distributed setting such that nothing
other than the output of the function is revealed. One can therefore run a differentially private algorithm in
a distributed manner using SFE (see e.g. [DKM+06, BNO08]), in the absence of a trusted curator.



2 Definitions 

Differential privacy is a privacy definition for computations run against sensitive input data sets. Its require- 
ment, informally, is that the computation behaves nearly identically on two input data sets that are nearly 
identical; the probability of any outcome must not increase by more than a small constant factor when the 
input set is altered by a single element. Formally, 



Definition 1 ([DMNS06]). We say a randomized computation $M$ has $\epsilon$-differential privacy if for any two
input sets $A$ and $B$ with symmetric difference one, and for any set of outcomes $S \subseteq \mathrm{Range}(M)$,

$$\Pr[M(A) \in S] \le \exp(\epsilon) \times \Pr[M(B) \in S]. \tag{2.1}$$

The definition has several appealing properties from a privacy perspective. One that is most important
for us is that arbitrary sequences of differentially private computations are also differentially private, with
an $\epsilon$ parameter equal to the sum of those comprising the sequence. This is true even when subsequent
computations can depend on and incorporate the results of prior differentially private computations [DKM+06],
allowing repetition of differentially private steps to improve solutions.
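
To see why the parameters add, here is a short derivation for discrete outcomes. The names $M_1$, $M_2$, $\epsilon_1$, $\epsilon_2$ are introduced only for this illustration: assume $M_1$ is $\epsilon_1$-differentially private and that, for every fixed first result $r_1$, the second computation $M_2(\cdot, r_1)$ is $\epsilon_2$-differentially private. For adjacent inputs $A$, $B$ and any pair of outcomes $(r_1, r_2)$:

```latex
\begin{align*}
\Pr[(M_1, M_2)(A) = (r_1, r_2)]
  &= \Pr[M_1(A) = r_1] \cdot \Pr[M_2(A, r_1) = r_2] \\
  &\le e^{\epsilon_1} \Pr[M_1(B) = r_1] \cdot e^{\epsilon_2} \Pr[M_2(B, r_1) = r_2] \\
  &= e^{\epsilon_1 + \epsilon_2} \Pr[(M_1, M_2)(B) = (r_1, r_2)] .
\end{align*}
```

Summing this pointwise bound over all pairs in a set $S$ of outcomes gives inequality (2.1) with parameter $\epsilon_1 + \epsilon_2$.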

2.1 Approximate Differential Privacy 



One relaxation of differential privacy [DKM+06] allows a small additive term in the bound:

Definition 2. We say a randomized computation $M$ has $\delta$-approximate $\epsilon$-differential privacy if for any two
input sets $A$ and $B$ with symmetric difference one, and for any set of outcomes $S \subseteq \mathrm{Range}(M)$,

$$\Pr[M(A) \in S] \le \exp(\epsilon) \times \Pr[M(B) \in S] + \delta. \tag{2.2}$$



The flavor of guarantee is that although not all events have their probabilities preserved, the alteration
is only for very low probability events, and is very unlikely to happen. The $\delta$ is best thought of as $1/\mathrm{poly}(n)$
for a data set containing some subset of $n$ candidate records. We note that there are stronger notions
of approximate differential privacy (cf. [MKA+08]), but in our settings, they are equivalent up to $\mathrm{poly}(n)$
changes in $\delta$. We therefore restrict ourselves to this definition here.

2.2 The Exponential Mechanism 



One particularly general tool that we will often use is the exponential mechanism of [MT07]. This construction
allows differentially private computation over arbitrary domains and ranges, parametrized by a query
function $q(A, r)$ mapping a pair of input data set $A$ (a multiset over some domain) and candidate result $r$ to
a real valued "score". With $q$ and a target privacy value $\epsilon$, the mechanism selects an output with exponential
bias in favor of high scoring outputs:

$$\Pr[\mathcal{E}_q^{\epsilon}(A) = r] \propto \exp(\epsilon\, q(A, r)). \tag{2.3}$$

If the query function $q$ has the property that any two adjacent data sets have score within $\Delta$ of each other,
for all possible outputs $r$, the mechanism provides $2\epsilon\Delta$-differential privacy. Typically, we would normalize $q$
so that $\Delta = 1$. We will be using this mechanism almost exclusively over discrete ranges, where we can derive
the following simple analogue of a theorem of [MT07], that the probability of a highly suboptimal output is
exponentially low:

Theorem 2.1. The exponential mechanism, when used to select an output $r \in R$, gives $2\epsilon\Delta$-differential
privacy. Letting $R_{OPT}$ be the subset of $R$ achieving $q(A, r) = \max_r q(A, r)$, it ensures that

$$\Pr\left[q(A, \mathcal{E}_q^{\epsilon}(A)) < \max_r q(A, r) - \ln(|R|/|R_{OPT}|)/\epsilon - t/\epsilon\right] \le \exp(-t). \tag{2.4}$$

The proof of the theorem is almost immediate. Write $\mathrm{OPT} = \max_r q(A, r)$. Any outcome with score less than
$\mathrm{OPT} - \ln(|R|/|R_{OPT}|)/\epsilon - t/\epsilon$ will have normalized probability at most $\exp(-t)/|R|$: each has weight at most
$\exp(\epsilon\,\mathrm{OPT} - t)\,|R_{OPT}|/|R|$, but is normalized by at least $|R_{OPT}| \exp(\epsilon\,\mathrm{OPT})$ from the optimal outputs. As
there are at most $|R|$ such outputs their cumulative probability is at most $\exp(-t)$.
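
The mechanism in (2.3) is easy to write down for a finite range. The following is a minimal sketch, not the authors' implementation: the function names and the toy counting query are hypothetical, chosen so that adjacent data sets change every score by at most $\Delta = 1$. It builds the output distribution explicitly and measures the worst-case log-ratio of outcome probabilities on two adjacent inputs, which should not exceed $2\epsilon\Delta$.

```python
import math
import random

def exp_mech_distribution(scores, eps):
    """Pr[output = r] proportional to exp(eps * scores[r]) over a finite range."""
    top = max(scores.values())  # shift by the max score for numerical stability
    weights = {r: math.exp(eps * (s - top)) for r, s in scores.items()}
    total = sum(weights.values())
    return {r: w / total for r, w in weights.items()}

def exp_mech_sample(scores, eps, rng=random):
    """Draw one output from the exponential mechanism."""
    dist = exp_mech_distribution(scores, eps)
    outcomes = list(dist)
    return rng.choices(outcomes, weights=[dist[r] for r in outcomes], k=1)[0]

def max_log_ratio(dist_a, dist_b):
    """Largest |ln(Pr_A[r] / Pr_B[r])| over all single outcomes r."""
    return max(abs(math.log(dist_a[r] / dist_b[r])) for r in dist_a)

# Hypothetical toy query: q(A, r) = number of records in A equal to r (Delta = 1).
def count_scores(data, candidates):
    return {r: sum(1 for x in data if x == r) for r in candidates}

candidates = ["a", "b", "c"]
A = ["a", "a", "b", "c", "a"]
B = A + ["b"]  # adjacent to A: symmetric difference one
eps = 0.5
dist_A = exp_mech_distribution(count_scores(A, candidates), eps)
dist_B = exp_mech_distribution(count_scores(B, candidates), eps)
worst = max_log_ratio(dist_A, dist_B)  # bounded by 2 * eps * Delta
```

For a discrete range it suffices to check single outcomes, since the bound for a set $S$ follows by summing over its elements.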

3 Private Min-Cut 

Given a graph $G = (V, E)$ the minimum cut problem is to find a cut $(S, S^c)$ so as to minimize $E(S, S^c)$.
In absence of privacy constraints, this problem is efficiently solvable exactly. However, outputting an exact
solution violates privacy, as we show in Section 3.1. Thus, we give an algorithm to output a cut within
additive $O(\log n/\epsilon)$ edges of optimal.

The algorithm has two stages: First, given a graph $G$, we add edges to the graph to raise the cost
of the min cut to at least $4\ln n/\epsilon$, in a differentially private manner. Second, we deploy the exponential
mechanism over all cuts in the graph, using a theorem of Karger to show that for graphs with min cut
at least $4\ln n/\epsilon$ the number of cuts within additive $t$ of OPT increases no faster than exponentially with
$t$. Although the exponential mechanism takes time exponential in $n$, we can construct a polynomial time
version by considering only the polynomially many cuts within $O(\ln n/\epsilon)$ of OPT. Below, let $\mathrm{Cost}(H, (S, S^c))$
denote the size $E_H(S, S^c)$ of the cut $(S, S^c)$ in a graph $H$.

Algorithm 1 The Min-Cut Algorithm
1: Input: $G = (V, E)$, $\epsilon$.
2: Let $H_0 \subset H_1 \subset \cdots \subset H_{\binom{n}{2}}$ be arbitrary strictly increasing sets of edges on $V$.
3: Choose index $i \in [0, \binom{n}{2}]$ with probability proportional to $\exp(-\epsilon\,|\mathrm{OPT}(G \cup H_i) - 8\ln n/\epsilon|)$.
4: Choose a subset $S \in 2^V \setminus \{\emptyset, V\}$ with probability proportional to $\exp(-\epsilon\,\mathrm{Cost}(G \cup H_i, (S, S^c)))$.
5: Output the cut $C = (S, S^c)$.
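
The five steps can be sketched directly by brute force on very small graphs. This is an illustrative, exponential-time sketch and not the efficient variant discussed later; the function names are hypothetical, vertices are assumed to be labeled $0, \ldots, n-1$, and the nested edge sets $H_i$ are taken to be the first $i$ vertex pairs in lexicographic order, which is one arbitrary choice among many.

```python
import itertools
import math
import random

def cut_cost(edges, S):
    """Number of edges with exactly one endpoint in S."""
    return sum(1 for u, v in edges if (u in S) != (v in S))

def proper_subsets(n):
    """All S in 2^V minus {empty set, V}, for V = {0, ..., n-1}."""
    return [frozenset(c) for size in range(1, n)
            for c in itertools.combinations(range(n), size)]

def sample_proportional(items, log_weights, rng):
    """Sample an item with probability proportional to exp(log_weight)."""
    top = max(log_weights)  # shift for numerical stability
    weights = [math.exp(w - top) for w in log_weights]
    return rng.choices(items, weights=weights, k=1)[0]

def private_min_cut(n, edges, eps, rng=random):
    graph = {tuple(sorted(e)) for e in edges}
    subsets = proper_subsets(n)
    pairs = list(itertools.combinations(range(n), 2))
    # Step 2: H_i = first i vertex pairs, so H_0, H_1, ... are strictly increasing.
    augmented = [graph | set(pairs[:i]) for i in range(len(pairs) + 1)]
    opts = [min(cut_cost(g, T) for T in subsets) for g in augmented]
    # Step 3: pick i so that OPT(G u H_i) is close to 8 ln n / eps.
    target = 8 * math.log(n) / eps
    i = sample_proportional(list(range(len(augmented))),
                            [-eps * abs(o - target) for o in opts], rng)
    # Step 4: exponential mechanism over all cuts of G u H_i.
    g = augmented[i]
    S = sample_proportional(subsets, [-eps * cut_cost(g, T) for T in subsets], rng)
    # Step 5: output the cut (S, complement of S).
    return S, frozenset(range(n)) - S
```

Note that Step 4 scores cuts by their cost in the augmented graph $G \cup H_i$, not in $G$ itself; a call such as `private_min_cut(4, [(0, 1), (1, 2), (2, 3), (3, 0)], eps=1.0)` returns the two sides of the sampled cut.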



Our result relies on a result of Karger about the number of near-minimum cuts in a graph [Kar93].

Lemma 3.1 ([Kar93]). For any graph $G$ with min cut $C$, there are at most $n^{2\alpha}$ cuts of size at most $\alpha C$.

By enlarging the size of the min cut in $G \cup H_i$ to at least $4\ln n/\epsilon$, we ensure that the number of cuts of
value at most $\mathrm{OPT}(G \cup H_i) + t$ is bounded by $n^2 \exp(\epsilon t/2)$. The downweighting of the exponential mechanism will
be able to counteract this growth in number and ensure that we select a good cut.
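
The counting step can be spelled out as follows. This short derivation is added for illustration and uses only Lemma 3.1; the symbol $C$ denotes the min cut value of $G \cup H_i$ and is assumed to satisfy $C \ge 4\ln n/\epsilon$. A cut of value at most $C + t$ has size at most $\alpha C$ with $\alpha = 1 + t/C$, so the number of such cuts is at most

```latex
\begin{align*}
n^{2\alpha} = n^{2(1 + t/C)}
  = n^2 \exp\!\left(\frac{2 t \ln n}{C}\right)
  \le n^2 \exp\!\left(\frac{2 t \ln n}{4 \ln n / \epsilon}\right)
  = n^2 \exp\!\left(\frac{\epsilon t}{2}\right).
\end{align*}
```

Since each such cut is downweighted by $\exp(-\epsilon t)$ relative to an optimal one, the total weight of cuts at distance $t$ from optimal still decays like $n^2 \exp(-\epsilon t/2)$.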

Theorem 3.2. For any graph $G$, the expected cost of ALG is at most $\mathrm{OPT} + O(\ln n/\epsilon)$.

Proof. First, we argue that the selected index $i$ satisfies $4\ln n/\epsilon < \mathrm{OPT}(G \cup H_i) < \mathrm{OPT}(G) + 12\ln n/\epsilon$ with
probability at least $1 - 1/n^2$. For $\mathrm{OPT} > 8\ln n/\epsilon$, Equation 2.4 ensures that the probability of exceeding the
optimal choice ($H_0$) by $4\ln n/\epsilon$ is at most $1/n^2$. Likewise, for $\mathrm{OPT} < 8\ln n/\epsilon$, there is some optimal $H_i$
achieving min cut size $8\ln n/\epsilon$, and the probability we end up farther away than $4\ln n/\epsilon$ is at most $1/n^2$.

Assuming now that $\mathrm{OPT}(G \cup H_i) > 4\ln n/\epsilon$, Karger's lemma argues that the number $c_t$ of cuts in $G \cup H_i$
of cost at most $\mathrm{OPT}(G \cup H_i) + t$ is at most $n^2 \exp(\epsilon t/2)$. As we are assured a cut of size $\mathrm{OPT}(G \cup H_i)$ exists,
each cut of size $\mathrm{OPT}(G \cup H_i) + t$ will receive probability at most $\exp(-\epsilon t)$. Put together, the probability of
a cut exceeding $\mathrm{OPT}(G \cup H_i) + b$ is at most

$$\Pr[\mathrm{Cost}(G \cup H_i, C) > \mathrm{OPT}(G \cup H_i) + b] \le \sum_{t > b} \exp(-\epsilon t)(c_t - c_{t-1}) \tag{3.5}$$

$$\le (\exp(\epsilon) - 1) \sum_{t > b} \exp(-\epsilon t)\, c_t \tag{3.6}$$

$$\le (\exp(\epsilon) - 1) \sum_{t > b} \exp(-\epsilon t/2)\, n^2 \tag{3.7}$$

The geometric sum evaluates to $\exp(-\epsilon b/2)\, n^2/(\exp(\epsilon/2) - 1)$, and the denominator is within a constant factor of the
leading factor of $(\exp(\epsilon) - 1)$, for $\epsilon < 1$. For $b = 8\ln n/\epsilon$, this probability becomes at most $O(1/n^2)$. □

Theorem 3.3. The algorithm above preserves $2\epsilon$-differential privacy.

Note that the first instance of the exponential mechanism in our algorithm runs efficiently (since it
is selecting from only $\binom{n}{2}$ objects), but the second instance does not. We now describe how to achieve
$(\epsilon, \delta)$-differential privacy efficiently.

First recall that using Karger's algorithm we can efficiently (with high probability) generate all cuts of
size at most $k\,\mathrm{OPT}$ for any constant $k$. Indeed it is shown in [Kar93] that in a single run of his algorithm,
any such cut is output with probability at least $n^{-2k}$, so that $n^{2k+1}$ runs of the algorithm will output all such
cuts except with an exponentially small probability.

Our efficient algorithm works as follows: in Step 4 of Algorithm 1, instead of sampling amongst all possible
cuts, we restrict attention to the set of cuts generated in $n^7$ runs of Karger's algorithm. We claim that the
output distribution of this algorithm has statistical distance $O(1/n^2)$ from that of Algorithm 1, which would
imply that we get $(\epsilon, O(1/n^2))$-differential privacy.

Consider a hypothetical algorithm that generates the cut $(S, S^c)$ as in Algorithm 1 but then outputs
FAIL whenever this cut is not in the set of cuts generated by the $n^7$ runs of Karger's algorithm. We first show that the
probability that this algorithm outputs FAIL is $O(1/n^2)$. As shown above, $\mathrm{OPT}(G \cup H_i)$ is at least $4\ln n/\epsilon$
except with probability $O(1/n^2)$. Conditioned on this, the cut chosen in Step 4 has cost at most $3\,\mathrm{OPT}(G \cup H_i)$
except with probability $O(1/n^2)$. Since each such cut is in the sample except with exponentially small probability,
the claim follows. Finally, note that this hypothetical algorithm can be naturally coupled with both the
algorithms so that the outputs agree whenever the former doesn't output FAIL. This implies the claimed
bound on the statistical distance. We remark that we have not attempted to optimize the running time here;
both the running time and the value of $\delta$ can be improved by choosing a larger constant (instead of 8) in
Step 3, at the cost of increasing the additive error by an additional constant.
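
For readers unfamiliar with the cut-generation step, the following is a rough sketch of random contraction in the spirit of Karger's algorithm; it is an illustration under stated assumptions rather than the procedure analysed in [Kar93]. The names `contraction_cut` and `collect_cuts` and the `stop` parameter are hypothetical. Edges are contracted in uniformly random order until `stop` super-vertices remain, and a random nontrivial bipartition of those super-vertices is returned; with `stop=2` this is the basic minimum-cut version, while a larger `stop` (about $2k$) is the usual way to also reach cuts of size up to $k\,\mathrm{OPT}$.

```python
import random

def contraction_cut(n, edges, stop=2, rng=random):
    """One run of random contraction on vertices 0..n-1 (n >= 2, stop >= 2)."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    order = list(edges)
    rng.shuffle(order)  # random edge order; edges inside a super-vertex are skipped
    components = n
    for u, v in order:
        if components <= stop:
            break
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv  # contract the edge (u, v)
            components -= 1
    roots = sorted({find(x) for x in range(n)})
    while True:  # random bipartition of the super-vertices, both sides non-empty
        side = {r: rng.random() < 0.5 for r in roots}
        if 0 < sum(side.values()) < len(roots):
            break
    S = frozenset(x for x in range(n) if side[find(x)])
    # Normalize so that the returned side always contains vertex 0.
    return S if 0 in S else frozenset(range(n)) - S

def collect_cuts(n, edges, runs, stop=2, rng=random):
    """Distinct cuts seen over several independent runs."""
    return {contraction_cut(n, edges, stop, rng) for _ in range(runs)}

def cut_cost(edges, S):
    return sum(1 for u, v in edges if (u in S) != (v in S))
```

The efficient variant of Step 4 would then run the exponential mechanism over `collect_cuts(...)` instead of over all $2^n - 2$ subsets.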

3.1 Lower Bounds 

We next show that this additive error is unavoidable for any differentially private algorithm. The lower 
bound is information-theoretic and thus applies also to computationally inefficient algorithms. 

Theorem 3.4. Any $\epsilon$-differentially private algorithm for min-cut must incur an expected additive $\Omega(\ln n/\epsilon)$
cost over OPT, for any $\epsilon \in (3\ln n/n, 1/12)$.

Proof. Consider a $(\ln n/(3\epsilon))$-regular graph $G = (V, E)$ on $n$ vertices such that the minimum cuts are exactly
those that isolate a single vertex, and any other cut has size at least $\ln n/(2\epsilon)$ (a simple probabilistic argument
establishes the existence of such a $G$; in fact a randomly chosen $(\ln n/(3\epsilon))$-regular graph has this property with
high probability).

Let $M$ be an $\epsilon$-differentially private algorithm for the min-cut. Given the graph $G$, $M$ outputs a partition
of $V$. Since there are $n = |V|$ singleton cuts, there exists a vertex $v$ such that the mechanism $M$ run on $G$
outputs the cut $(\{v\}, V \setminus \{v\})$ with probability at most $1/n$, i.e.

$$\Pr[M(V, E) = (\{v\}, V \setminus \{v\})] \le \frac{1}{n}.$$

Now consider the graph $G' = (V, E')$, with the edges incident on $v$ removed from $G$, i.e. $E' = E \setminus \{e : v \in e\}$.
Since $M$ satisfies $\epsilon$-differential privacy and $E$ and $E'$ differ in at most $\ln n/(3\epsilon)$ edges,

$$\Pr[M(V, E') = (\{v\}, V \setminus \{v\})] \le \exp\left(\epsilon \cdot \frac{\ln n}{3\epsilon}\right) \cdot \frac{1}{n} = \frac{1}{n^{2/3}}.$$

Thus with probability at least $1 - n^{-2/3}$, $M(G')$ outputs a cut other than the minimum cut $(\{v\}, V \setminus \{v\})$. But
all other cuts, even with these edges removed, cost at least $\ln n/(6\epsilon)$. Since OPT is zero for $G'$, the claim
follows. □

4 Private k-Median

We next consider a private version of the metric k-median problem: There is a pre-specified set of points
$V$ and a metric on them, $d : V \times V \to \mathbb{R}$. There is a (private) set of demand points $D \subseteq V$. We wish
to select a set of medians $F \subseteq V$ with $|F| = k$ to minimize the quantity $\mathrm{cost}(F) = \sum_{v \in D} d(v, F)$ where
$d(v, F) = \min_{f \in F} d(v, f)$. Let $\Delta = \max_{u,v \in V} d(u, v)$ be the diameter of the space.

As we show in Section 4.1, any privacy-preserving algorithm for k-median must incur an additive loss of
$\Omega(\Delta \cdot k \ln(n/k)/\epsilon)$, regardless of computational constraints. We observe that running the exponential
mechanism to choose one of the $\binom{n}{k}$ subsets of medians gives a (computationally inefficient) additive guarantee.

Theorem 4.1. Using the exponential mechanism to pick a set of $k$ facilities gives an $O(\binom{n}{k}\,\mathrm{poly}(n))$-time
$\epsilon$-differentially private algorithm that outputs a solution with expected cost $\mathrm{OPT} + O(k\Delta \log n/\epsilon)$.

We next give a polynomial-time algorithm that gives a slightly worse approximation guarantee. Our
algorithm is based on the local search algorithm of Arya et al. [AGK+04]. We start with an arbitrary set of
$k$ medians, and use the exponential mechanism to look for a (usually) improving swap. After running this
local search for a suitable number of steps, we select a good solution from amongst the ones seen during the
local search. The following result shows that if the current solution is far from optimal, then one can find
improving swaps.



Theorem 4.2 (Arya et al. [AGK+04]). For any set $F \subseteq V$ with $|F| = k$, there exists a set of $k$ swaps
$(x_1, y_1), \ldots, (x_k, y_k)$ such that $\sum_{i=1}^{k} (\mathrm{cost}(F) - \mathrm{cost}(F - \{x_i\} + \{y_i\})) \ge \mathrm{cost}(F) - 5\,\mathrm{OPT}$.

Corollary 4.3. For any set $F \subseteq V$ with $|F| = k$, there exists some swap $(x, y)$ such that

$$\mathrm{cost}(F) - \mathrm{cost}(F - \{x\} + \{y\}) \ge \frac{\mathrm{cost}(F) - 5\,\mathrm{OPT}}{k}.$$



Algorithm 2 The k-Median Algorithm
1: Input: $V$, demand points $D \subseteq V$, $k$, $\epsilon$.
2: Let $F_1 \subset V$ arbitrarily with $|F_1| = k$, $\epsilon' \leftarrow \epsilon/(2\Delta(T + 1))$.
3: for $i = 1$ to $T$ do
4: Select $(x, y) \in F_i \times (V \setminus F_i)$ with probability proportional to $\exp(-\epsilon' \times \mathrm{cost}(F_i - \{x\} + \{y\}))$.
5: Let $F_{i+1} \leftarrow F_i - \{x\} + \{y\}$.
6: end for
7: Select $j$ from $\{1, 2, \ldots, T\}$ with probability proportional to $\exp(-\epsilon' \times \mathrm{cost}(F_j))$.
8: Output $F_j$.
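
The local search above can be sketched in a few lines. This is an illustrative sketch under assumptions, not the authors' code: points are assumed to be indices $0, \ldots, n-1$, `dist` is assumed to be a symmetric distance matrix with positive diameter, $k < n$, and the function names are hypothetical.

```python
import math
import random

def kmedian_cost(dist, demand, F):
    """cost(F) = sum over demand points v of the distance from v to its nearest median."""
    return sum(min(dist[v][f] for f in F) for v in demand)

def sample_proportional(items, log_weights, rng):
    """Sample an item with probability proportional to exp(log_weight)."""
    top = max(log_weights)  # shift for numerical stability
    weights = [math.exp(w - top) for w in log_weights]
    return rng.choices(items, weights=weights, k=1)[0]

def private_k_median(dist, demand, k, eps, T, rng=random):
    n = len(dist)
    delta = max(max(row) for row in dist)      # diameter of the metric
    eps_prime = eps / (2 * delta * (T + 1))    # Step 2: per-step privacy parameter
    F = frozenset(range(k))                    # Step 2: arbitrary starting set F_1
    visited = []
    for _ in range(T):                         # Steps 3-6
        visited.append(F)                      # records F_1, ..., F_T
        candidates = [(F - {x}) | {y} for x in F for y in range(n) if y not in F]
        F = sample_proportional(
            candidates,
            [-eps_prime * kmedian_cost(dist, demand, c) for c in candidates], rng)
    # Step 7: privately pick one of the visited solutions.
    return sample_proportional(
        visited, [-eps_prime * kmedian_cost(dist, demand, c) for c in visited], rng)
```

Following Theorem 4.4, one would call this with `T = math.ceil(6 * k * math.log(n))`; the sketch recomputes every candidate cost from scratch, so it is only meant for small inputs.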



Theorem 4.4. Setting $T = 6k \ln n$ and $\epsilon' = \epsilon/(2\Delta(T + 1))$, the k-median algorithm provides $\epsilon$-differential
privacy and except with probability $O(1/\mathrm{poly}(n))$ outputs a solution of cost at most $6\,\mathrm{OPT} + O(\Delta k^2 \log^2 n/\epsilon)$.

Proof. We first prove the privacy. Since the cost function has sensitivity $\Delta$, Step 4 of the algorithm preserves
$2\epsilon'\Delta$-differential privacy. Since Step 4 is run at most $T$ times and privacy composes additively, outputting
all of the $T$ candidate solutions would give us $(2\epsilon'\Delta T)$-differential privacy. Picking out a good solution from
the $T$ candidates costs us another $2\epsilon'\Delta$, leading to the stated privacy guarantee.
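
Written out, the privacy accounting in this paragraph is the following sum, shown here only as a worked check of the choice of $\epsilon'$:

```latex
\begin{align*}
\underbrace{T \cdot 2\epsilon'\Delta}_{\text{Step 4, run } T \text{ times}}
  + \underbrace{2\epsilon'\Delta}_{\text{Step 7}}
  = 2\epsilon'\Delta (T + 1)
  = 2\Delta (T + 1) \cdot \frac{\epsilon}{2\Delta (T + 1)}
  = \epsilon .
\end{align*}
```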

We next show the approximation guarantee. By Corollary 4.3, so long as $\mathrm{cost}(F_i) > 6\,\mathrm{OPT}$, there exists a
swap $(x, y)$ that reduces the cost by at least $\mathrm{cost}(F_i)/6k$. As there are only $n^2$ possible swaps, the exponential
mechanism ensures through (2.4) that we are within additive $4\ln n/\epsilon'$ with probability at least $1 - 1/n^2$.
When $\mathrm{cost}(F_i) > 6\,\mathrm{OPT} + 24k \ln n/\epsilon'$, with probability $1 - 1/n^2$ we have $\mathrm{cost}(F_{i+1}) \le (1 - 1/6k) \times \mathrm{cost}(F_i)$.

This multiplicative decrease by $(1 - 1/6k)$ applies for as long as $\mathrm{cost}(F_i) > 6\,\mathrm{OPT} + 24k \ln n/\epsilon'$. Since
$\mathrm{cost}(F_1) \le n\Delta$, and $n\Delta(1 - 1/6k)^T \le \Delta \le 24k \ln n/\epsilon'$, there must exist an $i \le T$ such that $\mathrm{cost}(F_i) \le
6\,\mathrm{OPT} + 24k \ln n/\epsilon'$, with probability at least $(1 - T/n^2)$.

Finally, by applying the exponential mechanism again in the final stage, we select from the $F_i$ scoring
within an additive $4\ln n/\epsilon'$ of the optimal visited $F_i$ with probability at least $1 - 1/n^2$, again by (2.4).
Plugging in the value of $\epsilon'$, we get the desired result. Increasing the constants in the additive term can drive
the probability of failure to an arbitrarily small polynomial. □

4.1 k-Median Lower Bound

Theorem 4.5. Any $\epsilon$-differentially private algorithm for the k-median problem must incur cost
$\mathrm{OPT} + \Omega(\Delta \cdot k \ln(n/k)/\epsilon)$ on some inputs.

Proof. Consider a point set $V = [n] \times [L]$ of $nL$ points, with $L = \ln(n/k)/10\epsilon$, and a distance function
$d((i, j), (i', j')) = \Delta$ whenever $i \ne i'$ and $d((i, j), (i, j')) = 0$. Let $M$ be a differentially private algorithm
that takes a subset $D \subseteq V$ and outputs a set of $k$ locations, for some $k < n/4$. Given the nature of the metric
space, we assume that $M$ outputs a $k$-subset of $[n]$. For a set $A \subseteq [n]$, let $D_A = A \times [L]$. Let $A$ be a size-$k$
subset of $[n]$ chosen at random.

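We claim that $\mathbf{E}_{A,M}[|M(D_A) \cap A|] \le k/2$ for any $\epsilon$-differentially private algorithm $M$. Before we
prove this claim, note that it implies the expected cost of $M(D_A)$ is at least $(k/2) \times \Delta L$, which proves the theorem since
OPT = 0.

Now to prove the claim: define $\varphi := \frac{1}{k}\mathbf{E}_{A,M}[|A \cap M(D_A)|]$. We can rewrite

$$k \cdot \varphi = \mathbf{E}_{A,M}[|A \cap M(D_A)|] = k \cdot \mathbf{E}_{i \in [n]} \mathbf{E}_{A \setminus \{i\}, M}[\mathbf{1}_{i \in M(D_A)}].$$

Now changing $A$ to $A' := A \setminus \{i\} + \{i'\}$ for some random $i'$ requires altering at most $2L$ elements in $D_{A'}$, which
by the differential privacy guarantee should change the probability of the output by at most $e^{2\epsilon L} = (n/k)^{1/5}$.
Hence

$$\mathbf{E}_{i \in [n]} \mathbf{E}_{A', M}[\mathbf{1}_{i \in M(D_{A'})}] \ge (n/k)^{-1/5} \cdot \varphi.$$

But the expression on the left is just $k/n$, since there are at most $k$ medians. Hence $\varphi \le (k/n)^{4/5} \le 1/2$,
which proves the claim. □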

Corollary 4.6. Any 1-differentially private algorithm for uniform facility location that outputs the set of
chosen facilities must have approximation ratio $\Omega(\sqrt{n})$.

Proof. We consider instances defined on the uniform metric on $n$ points, with $d(u, v) = 1$ for all $u, v$, and
facility opening cost $f = 1/\sqrt{n}$. Consider a 1-differentially private mechanism $M$ when run on a randomly
chosen subset $A$ of size $k = \sqrt{n}$. Since OPT is $kf = 1$ for these instances, any $o(\sqrt{n})$-approximation must
select at least $k/2$ locations from $A$ in expectation. By an argument analogous to the above theorem, it follows
that any differentially private $M$ must output $n/20$ of the locations in expectation. This leads to a facility
opening cost of $\Omega(\sqrt{n})$. □

4.2 Euclidean Setting 

Feldman et al. [FFKN09] study private coresets for the k-median problem when the input points are in
$\mathbb{R}^d$. For $P$ points in the unit ball in $\mathbb{R}^d$, they give coresets with $(1 + \epsilon)$ multiplicative error, and additive
errors about $O(k^2 d^2 \log^2 P)$ and $O(16kd)^{2d}(d^{3/2} \log P \log dk)$ respectively for their inefficient and efficient
algorithms. Since Euclidean k-median has a PTAS, this leads to k-median approximations with the same
guarantees. We can translate our results to their setting by looking at a $(1/P)$-net of the unit ball as the
candidate set of $n$ points, of which some may appear. This would lead to an inefficient algorithm with
additive error $O(kd \log P)$, and an efficient algorithm with additive error $O(k^2 d^2 \log^2 P)$. The latter has a
multiplicative error of 6 and hence our efficient algorithms are incomparable. Note that coresets are more
general objects than just the k-median solution.

5 Vertex Cover 

We now turn to the problem of (unweighted) vertex cover, where we want to pick a set $S$ of vertices of
minimal size so that every edge in the graph is incident to at least one vertex in $S$. In the privacy-preserving
version of the problem, the private information we wish to conceal is the presence or absence of each edge.

Approximating the Vertex Cover Size. As mentioned earlier, even approximating the vertex cover size was
shown to be polynomially inapproximable under the constraint of functional privacy [HKKN01, BCNW06].
On the other hand, it is easy to approximate the size of the optimal vertex cover under differential privacy:
twice the size of a maximum matching is a 2-approximation to the optimal vertex cover, and this value only
changes by at most two with the presence or absence of a single edge. Hence, this value plus Laplace$(2/\epsilon)$
noise provides $\epsilon$-differential privacy [DMNS06]. (Here it is important that we use maximum rather than just
maximal matchings, since the size of the latter is not uniquely determined by the graph, and the presence or
absence of an edge may dramatically alter the size of the solution.) Interestingly enough, for weighted vertex
cover with maximum weight $w_{\max}$ (which we study in Section 5.2), we have to add in Lap$(w_{\max}/\epsilon)$ noise
to privately estimate the weight of the optimal solution, which can be much larger than OPT itself. The
mechanism in Section 5.2 avoids this barrier by outputting an implicit representation of the vertex cover,
and hence gives us a $O(1/\epsilon)$ multiplicative approximation with $\epsilon$-differential privacy.
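
The size estimate described above (twice the maximum matching size plus Laplace noise of scale $2/\epsilon$) can be sketched as follows. This is a minimal illustration with hypothetical function names: the maximum matching is found by exhaustive search, which is exponential and only suitable for tiny graphs (a polynomial-time matching algorithm would be used in practice), and the Laplace sample is drawn as the difference of two independent exponential variables.

```python
import random

def maximum_matching_size(edges):
    """Size of a maximum matching, by exhaustive search (tiny graphs only)."""
    edges = list(edges)

    def best(i, used):
        if i == len(edges):
            return 0
        u, v = edges[i]
        skip = best(i + 1, used)              # leave edge i unmatched
        if u in used or v in used:
            return skip
        return max(skip, 1 + best(i + 1, used | {u, v}))  # or take edge i

    return best(0, frozenset())

def laplace(scale, rng=random):
    """Laplace(scale) sample: difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_vertex_cover_size(edges, eps, rng=random):
    """2 * (maximum matching size) has sensitivity 2, so Laplace(2/eps) noise is added."""
    return 2 * maximum_matching_size(edges) + laplace(2 / eps, rng)
```

The returned value is a noisy real number; it estimates a quantity that lies between OPT and 2 OPT, up to additive noise of magnitude about $2/\epsilon$.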

The Vertex Cover Search Problem. If we want to find a vertex cover (and not just estimate its size), how can
we do this privately? In covering problems, the (private) data imposes hard constraints on a solution,
making them quite different from, say, min-cut. Indeed, while the private data only influences the objective
function in the min-cut problem, the data determines the constraints defining feasible solutions in the case
of the vertex cover problem. This hard covering constraint makes it impossible to actually output a small
vertex cover privately: as noted in the introduction, any differentially private algorithm for vertex cover that
outputs an explicit vertex cover (a subset of the $n$ vertices) must output a cover of size at least $n - 1$ with
probability 1 on any input, an essentially useless result.

In order to address this challenge, we require our algorithms to output an implicit representation of a 
cover: we privately output an orientation of the edges. Now for each edge, if we pick the endpoint that it
points to, we clearly get a vertex cover. Our analysis ensures that this vertex cover has size not much larger 
than the size of the optimal vertex cover for the instance. Hence, such an orientation may be viewed as a 
privacy-preserving set of instructions that allows for the construction of a good vertex cover in a distributed 
manner: in the case of the undercover agents mentioned in the introduction, the complete set of active 
dropoff sites (nodes) is not revealed to the agents, but an orientation on the edges tells each agent which 
dropoff site to use, if she is indeed an active agent. Our algorithms in fact output a permutation of all 
the vertices of the graph. Each edge can be considered oriented towards the endpoint appearing earlier in 
the permutation. Our lower bounds apply to the more general setting where we are allowed to output any 
orientation (and hence are stronger). 
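
To make the implicit representation concrete, the following minimal Python sketch turns a published permutation into the vertex cover it defines for a given edge set. The function name `cover_from_permutation` is a hypothetical helper introduced for illustration.

```python
def cover_from_permutation(perm, edges):
    """For each edge, keep whichever endpoint appears earlier in the permutation."""
    rank = {v: i for i, v in enumerate(perm)}
    return {u if rank[u] < rank[v] else v for u, v in edges}
```

Each edge only needs its own two endpoints and the public permutation, so the cover can be assembled in a distributed manner without any party seeing the full edge set.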

5.1 The Algorithm for Unweighted Vertex Cover 

Our (randomized) algorithm will output a permutation, and the vertex cover will be defined by picking, for
each edge, whichever of its endpoints appears first in the permutation. We show that this vertex cover will
be $(2 + O(1/\epsilon))$-approximate and $\epsilon$-differentially private. Our algorithm is based on a simple (non-private)
2-approximation to vertex cover [Pit85] that repeatedly selects an uncovered edge uniformly at random, and
includes a random endpoint of the edge. We can view the process, equivalently, as selecting a vertex at
random with probability proportional to its uncovered degree. We will take this formulation and mix in a
uniform distribution over the vertices, using a weight that will grow as the number of remaining vertices
decreases.

Let us start from $G_1 = G$, and let $G_i$ be the graph with $n - i + 1$ vertices remaining. We will write
$d_v(G)$ for the degree of vertex $v$ in graph $G$. The algorithm ALG in step $i$ chooses from the $n - i + 1$
vertices of $G_i$ with probability proportional to $d_v(G_i) + w_i$, for an appropriate sequence $\langle w_i \rangle$. Taking
$w_i = (4/\epsilon) \times (n/(n - i + 1))^{1/2}$ provides $\epsilon$-differential privacy and a $(2 + 16/\epsilon)$ approximation factor, the
proof of which will follow from the forthcoming Theorem 5.1 and Theorem 5.2.
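
As a numeric sanity check on this choice of weights, the sketch below evaluates the two quantities that the forthcoming proofs bound: the sum $\sum_i 2/((n-i+1)w_i)$, which should be at most $\epsilon$, and $2 + 2\,\mathrm{avg}_i w_i$, which should be at most $2 + 16/\epsilon$. The function names are illustrative assumptions, and a check on a few values of $n$ is of course not a proof.

```python
import math

def noise_weights(n, eps):
    """w_i = (4/eps) * sqrt(n / (n - i + 1)) for i = 1..n."""
    return [(4.0 / eps) * math.sqrt(n / (n - i + 1)) for i in range(1, n + 1)]

def privacy_exponent(n, eps):
    """Sum over i of 2 / ((n - i + 1) * w_i); the privacy argument needs this to be at most eps."""
    ws = noise_weights(n, eps)
    return sum(2.0 / ((n - i + 1) * ws[i - 1]) for i in range(1, n + 1))

def approximation_factor(n, eps):
    """2 + 2 * (average of the w_i); the accuracy argument bounds this by 2 + 16/eps."""
    ws = noise_weights(n, eps)
    return 2.0 + 2.0 * sum(ws) / n
```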

As stated the algorithm outputs a sequence of vertices, one per iteration. As remarked above, this 
permutation defines a vertex cover by picking the earlier occurring end point of each edge. 



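Algorithm 3 Unweighted Vertex Cover

1: let $n \leftarrow |V|$, $V_1 \leftarrow V$, $E_1 \leftarrow E$.
2: for $i = 1, 2, \ldots, n$ do
3:   let $w_i \leftarrow (4/\epsilon) \times \sqrt{n/(n - i + 1)}$.
4:   pick a vertex $v \in V_i$ with probability proportional to $d_{E_i}(v) + w_i$.
5:   output $v$. let $V_{i+1} \leftarrow V_i \setminus \{v\}$, $E_{i+1} \leftarrow E_i \setminus (\{v\} \times V_i)$.
6: end for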
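
The listing above translates fairly directly into code. The following Python sketch is one possible rendering, assuming a simple graph given as a vertex list and a list of edge pairs; the function name and data representation are illustrative choices, not part of the original algorithm statement.

```python
import math
import random

def private_vertex_ordering(vertices, edges, eps, rng=None):
    """Sample a vertex permutation as in Algorithm 3 (simple graphs, no self-loops)."""
    rng = rng or random.Random()
    remaining = list(vertices)
    live_edges = {frozenset(e) for e in edges}  # edges not yet covered
    n = len(remaining)
    order = []
    for i in range(1, n + 1):
        w_i = (4.0 / eps) * math.sqrt(n / (n - i + 1))
        # Uncovered degree of every remaining vertex.
        deg = {v: 0 for v in remaining}
        for e in live_edges:
            for v in e:
                deg[v] += 1
        # Pick v with probability proportional to (uncovered degree + w_i).
        v = rng.choices(remaining, weights=[deg[u] + w_i for u in remaining], k=1)[0]
        order.append(v)
        remaining.remove(v)
        live_edges = {e for e in live_edges if v not in e}
    return order
```

Recomputing degrees in every round keeps the sketch short at the cost of efficiency; an incremental degree table would avoid the repeated scan.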



Theorem 5.1 (Privacy). ALG satisfies $\epsilon$-differential privacy for the settings of $w_i$ above.

Proof. For any two sets of edges $A$ and $B$, and any permutation $\pi$, let $d_i$ be the degree of the $i$-th vertex in
the permutation $\pi$ and let $m_i$ be the number of remaining edges, both ignoring edges incident to the first $i - 1$ vertices
in $\pi$.

$$\frac{\Pr[ALG(A) = \pi]}{\Pr[ALG(B) = \pi]} = \prod_{i=1}^{n} \frac{(w_i + d_i(A))/((n - i + 1)w_i + 2m_i(A))}{(w_i + d_i(B))/((n - i + 1)w_i + 2m_i(B))}.$$

When $A$ and $B$ differ in exactly one edge, $d_i(A) = d_i(B)$ for all $i$ except the first endpoint incident to the edge
in the difference. Until this term $m_i(A)$ and $m_i(B)$ differ by exactly one, and after this term $m_i(A) = m_i(B)$.
The number of nodes is always equal, of course. Letting $j$ be the index in $\pi$ of the first endpoint of the edge
in difference, we can cancel all terms after $j$ and rewrite

$$\frac{\Pr[ALG(A) = \pi]}{\Pr[ALG(B) = \pi]} = \frac{w_j + d_j(A)}{w_j + d_j(B)} \times \prod_{i \le j} \frac{(n - i + 1)w_i + 2m_i(B)}{(n - i + 1)w_i + 2m_i(A)}.$$

An edge may have arrived in $A$, in which case $m_i(A) = m_i(B) + 1$ for all $i \le j$, and each term in the product
is at most one; moreover, $d_j(A) = d_j(B) + 1$, and hence the leading term is at most $1 + 1/w_j < \exp(1/w_1)$,
which is bounded by $\exp(\epsilon/2)$.

Alternately, an edge may have departed from $A$, in which case the lead term is no more than one, but
each term in the product exceeds one and their product must now be bounded. Note that $m_i(A) + 1 = m_i(B)$
for all relevant $i$, and that by ignoring all other edges we only make the product larger. Simplifying, and
using $1 + x \le \exp(x)$, we see

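$$\prod_{i \le j} \frac{(n - i + 1)w_i + 2m_i(B)}{(n - i + 1)w_i + 2m_i(A)} \le \prod_{i \le j} \frac{(n - i + 1)w_i + 2}{(n - i + 1)w_i + 0} = \prod_{i \le j} \left(1 + \frac{2}{(n - i + 1)w_i}\right) \le \exp\left(\sum_{i \le j} \frac{2}{(n - i + 1)w_i}\right).$$

The $w_i$ are chosen so that $\sum_i 2/((n - i + 1)w_i) = (\epsilon/\sqrt{n}) \sum_i 1/(2\sqrt{i})$ is at most $\epsilon$. □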

Theorem 5.2 (Accuracy). For all $G$, $E[ALG(G)] \le (2 + 2\,\mathrm{avg}_{i \le n} w_i) \times |OPT(G)| \le (2 + 16/\epsilon)|OPT(G)|$.

Proof. Let $OPT(G)$ denote an arbitrary optimal solution to the vertex cover problem on $G$. The proof is
inductive, on the size $n$ of $G$. For $G$ with $|OPT(G)| > n/2$, the theorem holds. For $G$ with $|OPT(G)| \le n/2$,
the expected cost of the algorithm is the probability that the chosen vertex $v$ is incident to an edge, plus the
expected cost of $ALG(G \setminus v)$.

$$E[ALG(G)] = \Pr[v \text{ incident on edge}] + E_v[E[ALG(G \setminus v)]].$$

We will bound the second term using the inductive hypothesis. To bound the first term, the probability that
$v$ is chosen incident to an edge is at most $(2mw_n + 2m)/(nw_n + 2m)$, as there are at most $2m$ vertices incident
to edges. On the other hand, the probability that we pick a vertex in $OPT(G)$ is at least $(|OPT(G)|w_n +
m)/(nw_n + 2m)$. Since $|OPT(G)|$ is non-negative, we conclude that



$$\Pr[v \text{ incident on edge}] \le (2 + 2w_n)(m/(nw_n + 2m)) \le (2 + 2w_n)\Pr[v \in OPT(G)].$$

Since $\mathbb{1}[v \in OPT(G)] \le |OPT(G)| - |OPT(G \setminus v)|$, and using the inductive hypothesis, we get

$$E[ALG(G)] \le (2 + 2w_n) \times (|OPT(G)| - E_v[|OPT(G \setminus v)|]) + (2 + 2\,\mathrm{avg}_{i<n} w_i) \times E_v[|OPT(G \setminus v)|]$$
$$= (2 + 2w_n) \times |OPT(G)| + (2\,\mathrm{avg}_{i<n} w_i - 2w_n) \times E_v[|OPT(G \setminus v)|].$$

The probability that $v$ is from an optimal vertex cover is at least $(|OPT(G)|w_n + m)/(nw_n + 2m)$, as mentioned
above, and (using $(a + b)/(c + d) \ge \min\{a/c, b/d\}$) is at least $\min\{|OPT(G)|/n, 1/2\} = |OPT(G)|/n$, since
$|OPT(G)| \le n/2$ by assumption. Thus $E[|OPT(G \setminus v)|]$ is bounded above by $(1 - 1/n) \times |OPT(G)|$, giving

$$E[ALG(G)] \le (2 + 2w_n) \times |OPT(G)| + (2\,\mathrm{avg}_{i<n} w_i - 2w_n) \times (1 - 1/n) \times |OPT(G)|.$$

Simplification yields the claimed results, and instantiating Wi completes the proof. □ 

Hallucinated Edges. Here is a slightly different way to implement the intuition behind the above algorithm:
imagine adding $O(1/\epsilon)$ "hallucinated" edges to each vertex (the other endpoints of these hallucinated
edges being fresh "hallucinated" vertices), and then sampling vertices without replacement proportional to
these altered degrees. However, once (say) $n/2$ vertices have been sampled, output the remaining vertices
in random order. This view will be useful to keep in mind for the weighted vertex cover proof. (A formal
analysis of this algorithm is in Appendix A.)

5.2 Weighted Vertex Cover 

In the weighted vertex cover problem, each vertex $v$ is assigned a weight $w(v)$, and the cost of any vertex
cover is the sum of the weights of the participating vertices. One can extend the unweighted 2-approximation
that draws vertices at random with probability proportional to their uncovered degree to a weighted
2-approximation by drawing vertices with probability proportional to their uncovered degree divided by their
weight. The differentially private analog of this algorithm essentially draws vertices with probability
proportional to $1/\epsilon$ plus their degree, all divided by the weight of the vertex; the algorithm we present here is
based on this idea.

Define the score of a vertex to be $s(v) = 1/w(v)$. Our algorithm involves hallucinating edges: to
each vertex, we add in $1/\epsilon$ hallucinated edges, the other endpoints of which are imaginary vertices, whose
weight is considered to be $\infty$ (and hence has zero score). The score of an edge $e = (u, v)$ is defined to be
$s(e) = s(u) + s(v)$; hence the score of a fake edge $f$ incident on $u$ is $s(f) = s(u)$, since its other (imaginary)
endpoint has infinite weight and zero score. We will draw edges with probability proportional to their
score, and then select an endpoint to output with probability proportional to its score. In addition, once a
substantial number of vertices of at least a particular weight have been output, we will output the rest of
those vertices.

Assume the minimum vertex weight is 1 and the maximum is $2^J$. For simplicity, we round the weight
of each vertex up to a power of 2, at a potential loss of a factor of two in the approximation. Define the
$j$-th weight class $V_j$ to be the set of vertices of weight $2^j$. In addition, we will assume that $|V_j| = |V_{j+1}|$ for
all weight classes. In order to achieve this, we hallucinate additional fake vertices. We will never actually
output a hallucinated vertex. Let $N_j$ denote $|V_j|$.

We imagine the $i$-th iteration of the outer loop of the algorithm as happening at time $i$; note that one
vertex is output in Step 3, whereas multiple vertices might be output in Step 6. Let $n_i$ be the sum of the
scores of all real vertices not output before time $i$, and $m_i$ be the sum of the scores of all real edges not
covered before time $i$.

5.2.1 Privacy Analysis 

Theorem 5.3. The weighted vertex cover algorithm preserves $O(\epsilon)$-differential privacy.

Algorithm 4 Weighted Vertex Cover

1: while not all vertices have been output do
2:   pick an uncovered (real or hallucinated) edge $e = (u, v)$ with probability proportional to $s(e)$.
3:   output endpoint $u \in e$ with probability proportional to $s(u)$.
4:   while there exists some weight class $V_j$ such that the number of nodes of class $j$ or higher that we've
     output is at least $N_j/2 = |V_j|/2$ do
5:     pick the smallest such value of $j$
6:     output ("dump") all remaining vertices in $V_j$ in random order.
7:   end while
8: end while
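
The following Python sketch is one possible reading of Algorithm 4, offered under several assumptions that are not fixed by the listing: Steps 2 and 3 are merged into a single draw of a surviving vertex with weight $(d(v) + 1/\epsilon) \cdot s(v)$, fake vertices are modeled only through a common padded class size `N` (taken to be the largest real class), and only classes that still have surviving vertices are considered for dumping. All names are illustrative.

```python
import math
import random
from collections import Counter

def weighted_private_ordering(weights, edges, eps, rng=None):
    """weights: dict vertex -> weight >= 1; edges: iterable of (u, v) pairs."""
    rng = rng or random.Random()
    # Round weights up to powers of two: cls[v] = j means rounded weight 2**j.
    cls = {v: math.ceil(math.log2(w)) for v, w in weights.items()}
    score = {v: 2.0 ** -cls[v] for v in weights}
    # Assumption: fake vertices pad every class to the size of the largest real class.
    N = max(Counter(cls.values()).values())
    remaining = set(weights)
    live = {frozenset(e) for e in edges}  # uncovered real edges
    order = []

    def emit(v):
        order.append(v)
        remaining.discard(v)
        live.difference_update([e for e in live if v in e])

    while remaining:
        pool = list(remaining)
        deg = {v: 0 for v in pool}
        for e in live:
            for x in e:
                deg[x] += 1
        # Steps 2-3 combined: vertex v is output w.p. proportional to (d(v) + 1/eps) * s(v).
        emit(rng.choices(pool, weights=[(deg[v] + 1.0 / eps) * score[v] for v in pool], k=1)[0])
        while True:
            # Steps 4-6: dump the smallest class j whose threshold N/2 has been reached.
            ready = [j for j in {cls[v] for v in remaining}
                     if sum(cls[u] >= j for u in order) >= N / 2]
            if not ready:
                break
            dump = [v for v in remaining if cls[v] == min(ready)]
            rng.shuffle(dump)
            for v in dump:
                emit(v)
    return order
```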



Proof. Consider some potential output $\pi$ of the private vertex cover algorithm, and two weighted vertex
cover instances $A$ and $B$ that are identical except for one edge $e = (p, q)$. Let $p$ appear before $q$ in the
permutation $\pi$; since the vertex sets are the same, if the outputs of both $A$ and $B$ are $\pi$, then $p$ will be
output at the same time $t$ in both executions. Let $v_t$ be the vertex output in Step 3 at time $t$ in such an
execution; note that either $p = v_t$, or $p$ is output in Step 6 after $v_t$ is output.

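The probability that (conditioned on the history) a surviving vertex $v$ is output in Step 3 of the algorithm
at time $i$ is:

$$\sum_{\text{edges } e} \Pr[\text{pick } e] \cdot \Pr[\text{output } v \mid \text{pick } e] = \frac{(d(v) + 1/\epsilon) \cdot s(v)}{m_i + n_i/\epsilon}.$$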
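
The identity above can be checked exactly on a small instance. The sketch below, which assumes integer vertex weights and an integer number `inv_eps` of hallucinated edges per vertex (both simplifications made for exact arithmetic), computes the Step 3 output distribution once by summing over edges and once from the closed form. The function names are hypothetical.

```python
from fractions import Fraction

def step3_by_edges(weights, edges, inv_eps):
    """Output distribution obtained by drawing an edge, then one of its endpoints."""
    score = {v: Fraction(1, w) for v, w in weights.items()}
    mass = {v: Fraction(0) for v in weights}
    total = Fraction(0)
    for u, v in edges:
        s_e = score[u] + score[v]            # real edge score s(e) = s(u) + s(v)
        total += s_e
        mass[u] += s_e * (score[u] / s_e)    # Pr[pick e] * Pr[output u | pick e], unnormalized
        mass[v] += s_e * (score[v] / s_e)
    for v in weights:
        total += inv_eps * score[v]          # each hallucinated edge has score s(v)
        mass[v] += inv_eps * score[v]        # and always outputs its real endpoint v
    return {v: mass[v] / total for v in weights}

def step3_closed_form(weights, edges, inv_eps):
    """(d(v) + 1/eps) * s(v) / (m_i + n_i/eps), with 1/eps = inv_eps."""
    score = {v: Fraction(1, w) for v, w in weights.items()}
    deg = {v: sum(v in e for e in edges) for v in weights}
    n_i = sum(score.values())
    m_i = sum(score[u] + score[v] for u, v in edges)
    return {v: (deg[v] + inv_eps) * score[v] / (m_i + inv_eps * n_i) for v in weights}
```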

Since we compare the runs of the algorithm on $A$ and $B$ which differ only in edge $e$, these will be identical
after time $t$ when $e$ is covered, and hence

$$\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} = \frac{(d_A(v_t) + 1/\epsilon)\, s(v_t)}{(d_B(v_t) + 1/\epsilon)\, s(v_t)} \prod_{i \le t} \left(\frac{m_i^B + n_i/\epsilon}{m_i^A + n_i/\epsilon}\right).$$

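Note that if the extra edge $e \in A \setminus B$ then $d_A(v_t) \le d_B(v_t) + 1$ and $m_i^B \le m_i^A$, so the ratio of the
probabilities is at most $1 + \epsilon < \exp(\epsilon)$. Otherwise, the leading term is less than 1 and $m_i^B = m_i^A + s(e)$, and
we get

$$\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} \le \prod_{i \le t} \left(1 + \frac{s(e)}{m_i^A + n_i/\epsilon}\right) \le \exp\left(\epsilon \cdot s(e) \cdot \sum_{i \le t} \frac{1}{n_i}\right).$$

Let $T_j$ be the time steps $i \le t$ where vertices in $V_j$ are output in $\pi$. Letting $2^{j^*}$ be the weight of the
lighter endpoint of edge $e$, we can break the sum $\sum_{i \le t} \frac{1}{n_i}$ into two pieces and analyze each separately:

$$\sum_{i \le t} \frac{1}{n_i} = \sum_{j \le j^*} \sum_{i \in T_j} \frac{1}{n_i} + \sum_{j > j^*} \sum_{i \in T_j} \frac{1}{n_i}.$$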

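For the first partial sum, for some $j \le j^*$, let $\sum_{i \in T_j} \frac{1}{n_i} = \frac{1}{n_{i_0}} + \frac{1}{n_{i_1}} + \ldots + \frac{1}{n_{i_\lambda}}$ such that $i_0 > i_1 > \cdots > i_\lambda$.
We claim that $n_{i_0} \ge 2^{-j^*} N_{j^*}/2$. Indeed, since $e$ has not yet been covered, we must have output fewer than
$N_{j^*}/2$ vertices from levels $j^*$ or higher, and hence at least $N_{j^*}/2$ remaining vertices from $V_{j^*}$ contribute to
$n_{i_0}$. In each time step in $T_j$, at least one vertex of score $2^{-j}$ is output, so we have that $n_{i_\ell} \ge 2^{-j^*} N_{j^*}/2 + \ell \cdot 2^{-j}$.
Hence

$$\sum_{i \in T_j} \frac{1}{n_i} \le \frac{1}{2^{-j^*} N_{j^*}/2} + \frac{1}{2^{-j^*} N_{j^*}/2 + 2^{-j}} + \cdots + \frac{1}{2^{-j^*} N_{j^*}/2 + N_j 2^{-j}}.$$

Defining $\theta = 2^{j - j^*} \cdot N_{j^*}/2$, the expression above simplifies to

$$2^j \left(\frac{1}{\theta} + \frac{1}{\theta + 1} + \ldots + \frac{1}{\theta + N_j}\right) \le 2^j \ln\left(\frac{\theta + N_j}{\theta}\right) = 2^j \ln\left(1 + \frac{N_j}{\theta}\right).$$

Now using the assumption on the size of the weight classes, we have $N_j \le N_{j^*} \implies N_j/\theta \le 2^{j^* - j + 1}$, and
hence $\sum_{i \in T_j} \frac{1}{n_i} \le (j^* - j + 2) 2^j$, for any $j \le j^*$. Finally,

$$\sum_{j \le j^*} \sum_{i \in T_j} \frac{1}{n_i} \le \sum_{j \le j^*} (j^* - j + 2) 2^j = O(2^{j^*}).$$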



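We now consider the other partial sum $\sum_{j > j^*} \sum_{i \in T_j} \frac{1}{n_i}$. For any such value of $i$, we know that
$n_i \ge 2^{-j^*} N_{j^*}/2$. Moreover, there are at most $N_{j^*}/2$ times when we output a vertex from some weight class
$j \ge j^*$ before we output all of $V_{j^*}$; hence there are at most $N_{j^*}/2$ terms in the sum, each of which is at most
$\frac{1}{2^{-j^*} N_{j^*}/2}$, giving a bound of $2^{j^*}$ on the second partial sum. Putting the two together, we get that

$$\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} \le \exp\left(\epsilon \cdot s(e) \cdot O(2^{j^*})\right) = \exp(O(\epsilon)),$$

using the fact that $s(e) \le 2 \cdot 2^{-j^*}$, since the lighter endpoint of $e$ had weight $2^{j^*}$. □
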
5.2.2 Utility Analysis 

Call a vertex $v$ interesting if it is incident on a real uncovered edge when it is picked. Consider the weight
class $V_j$: let $I_j^1 \subseteq V_j$ be the set of interesting vertices output due to Step 3, and $I_j^2 \subseteq V_j$ be the set of
interesting vertices of class $j$ output due to Step 6. The cost incurred by the algorithm is $\sum_j 2^j (|I_j^1| + |I_j^2|)$.

Lemma 5.4. $E[\sum_j 2^j |I_j^1|] \le \frac{4(1 + \epsilon)}{\epsilon}\, OPT$

Proof. Every interesting vertex that our algorithm picks in Step 3 has at least one real edge incident on
it, and at most $\frac{1}{\epsilon}$ hallucinated edges. Conditioned on selecting an interesting vertex $v$, the selection is due
to a real edge with probability at least $1/(1 + \frac{1}{\epsilon})$. One can show that the (non-private) algorithm $\mathcal{A}$ that
selects only real edges is a 2-approximation [Pit85]. On the other hand each vertex in $I_j^1$ can be coupled to
a step of $\mathcal{A}$ with probability $\epsilon/(1 + \epsilon)$. Since we rounded up the costs by at most a factor of two, the claim
follows. □



Lemma 5.5. $E[|I_j^2|] \le 6\, E[\sum_{j' \ge j} |I_{j'}^1|]$

Proof. Let $t_j$ denote the time that class $j$ is dumped. Recall that by (5.2.1), we pick a surviving vertex $v$
with probability $\propto (d(v) + \frac{1}{\epsilon}) \cdot s(v)$ at each step. This expression summed over all uninteresting vertices in
$\cup_{j' \ge j} V_{j'}$ is at most $(1/\epsilon) \sum_{j' \ge j} 2^{-j'} N_{j'} \le 2^{-j+1} N_j/\epsilon$. On the other hand, at each step before time $t_j$, all
the interesting vertices in $I_j^2$ are available and the same expression summed over them is at least $2^{-j}|I_j^2|/\epsilon$.
Thus for any $t \le t_j$, conditioned on outputting a vertex $v_t \in \cup_{j' \ge j} V_{j'}$ in Step 3, the probability that it
is interesting is at least $\frac{|I_j^2| 2^{-j}/\epsilon}{(|I_j^2| 2^{-j} + 2^{1-j} N_j)/\epsilon} \ge \frac{|I_j^2|}{3N_j}$ (using $|I_j^2| \le N_j$). Now since we output $N_j/2$ vertices
from $\cup_{j' \ge j} V_{j'}$ in Step 3 before time $t_j$, we conclude that $E[\sum_{j' \ge j} |I_{j'}^1| \mid |I_j^2|] \ge \frac{N_j}{2} \times \frac{|I_j^2|}{3N_j} = \frac{|I_j^2|}{6}$. Taking
expectations completes the proof. □

We can now compute the total cost of all the interesting vertices dumped in Step 6 of the algorithm.

$$E[\mathrm{cost}(\cup_j I_j^2)] = \sum_j 2^j E[|I_j^2|] \le 6 \sum_j 2^j \sum_{j' \ge j} E[|I_{j'}^1|] \le 6 \sum_{j'} E[|I_{j'}^1|]\, 2^{j'+1} \le 12 \cdot E[\mathrm{cost}(\cup_j I_j^1)].$$

Finally, combining this calculation with Lemma 5.4, we conclude that our algorithm gives an $O(\frac{1}{\epsilon})$
approximation to the weighted vertex cover problem.

5.3 Vertex Cover Lower Bounds 

Theorem 5.6. Any algorithm for the vertex cover problem that prescribes edge-orientations with $\epsilon$-differential
privacy must have an $\Omega(1/\epsilon)$ approximation guarantee, for any $\epsilon \in (\frac{1}{n}, 1]$.

Proof. Let $V = \{1, 2, \ldots, \lceil \frac{1}{2\epsilon} \rceil\}$, and let $M$ be an $\epsilon$-differentially private algorithm that takes as input a
private set $E$ of edges, and outputs an orientation $M_E : V \times V \to V$, with $M_E(u, v) \in \{u, v\}$ indicating
to the edge which endpoint to use. Picking two distinct vertices $u \ne v$ uniformly at random (and equating
$(u, v)$ with $(v, u)$), we have by symmetry:

$$\Pr_{u,v}[M_\emptyset((u, v)) \ne u] = \frac{1}{2}.$$

Let $\star_u = (V, \{u\} \times (V \setminus \{u\}))$ be the star graph rooted at $u$. Since $\star_u$ and $\emptyset$ differ in at most $\lceil \frac{1}{2\epsilon} \rceil - 1 < \frac{1}{\epsilon}$
edges and $M$ satisfies $\epsilon$-differential privacy, we conclude that

$$\Pr_{u,v}[M_{\star_u}((u, v)) \ne u] \ge \frac{1}{2e}.$$

Thus the expected cost of $M$ when input a uniformly random $\star_u$ is at least $\frac{1}{2e} \times \lceil \frac{1}{2\epsilon} \rceil$, while $OPT(\star_u)$ is 1.
We can repeat this pattern arbitrarily, picking a random star from each group of $1/\epsilon$ vertices; this results in
graphs with arbitrarily large vertex covers where $M$ incurs cost $\Omega(1/\epsilon)$ times the optimal cost. □

6 Set Cover 

We now turn our attention to private approximations for the Set Cover Problem; here the set system $(U, \mathcal{S})$
is public, but the actual set of elements to be covered $R \subseteq U$ is the private information. As for vertex cover,
we cannot explicitly output a set cover that is good and private at the same time. Hence, we again output a
permutation over all the sets in the set system; this implicitly defines a set cover for $R$ by picking, for each
element of $R$, the first set in this permutation that contains it. Our algorithms for set cover give the slightly
weaker $(\epsilon, \delta)$-privacy guarantees.
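
As a small illustration of how a permutation of sets implicitly defines a cover, the Python sketch below maps each private element to the first set in the order that contains it. The function name and the dictionary representation of the set system are assumptions made for the example, and every private element is assumed to lie in at least one set.

```python
def cover_from_set_order(order, sets, private_elements):
    """order: list of set names; sets: dict name -> set of elements."""
    chosen = set()
    for x in private_elements:
        # First set in the published order that contains x.
        chosen.add(next(name for name in order if x in sets[name]))
    return chosen
```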

6.1 Unweighted Set Cover 

We are given a set system $(U, \mathcal{S})$ and must cover a private subset $R \subseteq U$. Let the cardinality of the set
system be $|\mathcal{S}| = m$, and let $|U| = n$. We first observe a computationally inefficient algorithm.

Theorem 6.1. The exponential mechanism, when used to pick a permutation of sets, runs in time $O(m!\,\mathrm{poly}(n))$
and gives an $O(\log(em/OPT)/\epsilon)$-approximation.

Proof. A random permutation, with probability at least $\binom{m}{OPT}^{-1}$, has all the sets in OPT before any set in
$OPT^c$. Thus the additive error is $O(\log \binom{m}{OPT}/\epsilon)$. □

The rest of the section gives a computationally efficient algorithm with slightly worse guarantees: this is 
a modified version of the greedy algorithm, using the exponential mechanism to bias towards picking large 
sets. 

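Algorithm 5 Unweighted Set Cover

1: Input: Set system $(U, \mathcal{S})$, private $R \subseteq U$ of elements to cover, $\epsilon$, $\delta$.
2: let $i \leftarrow 1$, $R_i \leftarrow R$, $\mathcal{S}_i \leftarrow \mathcal{S}$, $\epsilon' \leftarrow \epsilon/(2\ln(\frac{e}{\delta}))$.
3: for $i = 1, 2, \ldots, m$ do
4:   pick a set $S$ from $\mathcal{S}_i$ with probability proportional to $\exp(\epsilon' |S \cap R_i|)$.
5:   output set $S$.
6:   $R_{i+1} \leftarrow R_i \setminus S$, $\mathcal{S}_{i+1} \leftarrow \mathcal{S}_i - \{S\}$.
7: end for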
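
A minimal Python sketch of this greedy-with-exponential-mechanism loop follows. It assumes the scaled parameter is $\epsilon' = \epsilon/(2\ln(e/\delta))$, which is the value consistent with the privacy analysis below; the function name and the dictionary-of-sets input format are illustrative choices.

```python
import math
import random

def private_set_order(sets, private_elements, eps, delta, rng=None):
    """sets: dict name -> set of elements. Returns a permutation of the set names."""
    rng = rng or random.Random()
    eps_prime = eps / (2.0 * math.log(math.e / delta))
    uncovered = set(private_elements)
    remaining = dict(sets)
    order = []
    while remaining:
        names = list(remaining)
        gains = [len(remaining[name] & uncovered) for name in names]
        top = max(gains)
        # Subtracting the maximum leaves the sampling distribution unchanged and avoids overflow.
        weights = [math.exp(eps_prime * (g - top)) for g in gains]
        pick = rng.choices(names, weights=weights, k=1)[0]
        order.append(pick)
        uncovered -= remaining.pop(pick)
    return order
```

Only the sampling weights depend on the private set $R$; the set system itself is public, so listing all set names in the output reveals nothing beyond the order.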



6.1.1 Utility Analysis 

At the beginning of iteration $i$, say there are $m_i = m - i + 1$ remaining sets and $n_i = |R_i|$ remaining elements,
and define $L_i = \max_{S \in \mathcal{S}} |S \cap R_i|$, the largest number of uncovered elements covered by any set in $\mathcal{S}$. By a
standard argument, any algorithm that always picks sets of size $L_i/2$ is an $O(\ln n)$ approximation algorithm.

Theorem 6.2. The above algorithm achieves an expected approximation ratio of $O(\ln n + \frac{\ln m}{\epsilon'}) = O(\ln n + \frac{\ln m \ln(e/\delta)}{\epsilon})$.

Proof. As there is at least one set containing $L_i$ elements, our use of the exponential mechanism to select sets
combined with Equation 2.4 ensures that the probability we select a set covering fewer than $L_i - 3\ln m/\epsilon'$
elements is at most $1/m^2$. While $L_i > 6\ln m/\epsilon'$, with probability at least $(1 - 1/m)$ we always select sets that
cover at least $L_i/2$ elements, and can therefore use no more than $O(OPT \ln n)$ sets. Once $L_i$ drops below
this bound, we observe that the number of remaining elements $|R_i|$ is at most $OPT \cdot L_i$. Any permutation
therefore costs at most an additional $O(OPT \ln m/\epsilon')$. □

6.1.2 Privacy

Theorem 6.3. The unweighted set cover algorithm preserves $(\epsilon, \delta)$ differential privacy for any $\epsilon \in (0, 1)$,
and $\delta < 1/e$.

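Proof. Let $A$ and $B$ be two set cover instances that differ in some element $I$. Say that $\mathcal{S}^I$ is the collection
of sets containing $I$. Fix an output permutation $\pi$, and write $s_{i,j}(A)$ to denote the size of set $S_j$ after the
first $i - 1$ sets in $\pi$ have been added to the cover.

$$\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} = \prod_{i=1}^{m} \left(\frac{\exp(\epsilon' \cdot s_{i,\pi_i}(A)) / (\sum_j \exp(\epsilon' \cdot s_{i,j}(A)))}{\exp(\epsilon' \cdot s_{i,\pi_i}(B)) / (\sum_j \exp(\epsilon' \cdot s_{i,j}(B)))}\right) = \frac{\exp(\epsilon' \cdot s_{t,\pi_t}(A))}{\exp(\epsilon' \cdot s_{t,\pi_t}(B))} \cdot \prod_{i=1}^{t} \left(\frac{\sum_j \exp(\epsilon' \cdot s_{i,j}(B))}{\sum_j \exp(\epsilon' \cdot s_{i,j}(A))}\right),$$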

where $t$ is such that $S_{\pi_t}$ is the first set containing $I$ to fall in the permutation $\pi$. After $t$, the remaining
elements in $A$ and $B$ are identical, and all subsequent terms cancel. Moreover, except for the $t$-th term, the
numerators of both the top and bottom expression cancel, since all the relevant set sizes are equal. If $A$
contains $I$ and $B$ does not, the first term is $\exp(\epsilon')$ and each term in the product is at most 1.

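Now suppose that $B$ contains $I$ and $A$ does not. In this case, the first term is $\exp(-\epsilon') < 1$. Moreover,
in instance $B$, every set in $\mathcal{S}^I$ is larger by 1 than in $A$, and all others remain the same size. Therefore, we
have:

$$\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} \le \prod_{i=1}^{t} \left(\frac{(\exp(\epsilon') - 1) \cdot \sum_{j \in \mathcal{S}^I} \exp(\epsilon' \cdot s_{i,j}(A)) + \sum_j \exp(\epsilon' \cdot s_{i,j}(A))}{\sum_j \exp(\epsilon' \cdot s_{i,j}(A))}\right) = \prod_{i=1}^{t} \left(1 + (\exp(\epsilon') - 1) \cdot p_i(A)\right),$$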

where $p_i(A)$ is the probability that a set containing $I$ is chosen at step $i$ of the algorithm running on instance
$A$, conditioned on picking the sets $S_{\pi_1}, \ldots, S_{\pi_{i-1}}$ in the previous steps.

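For an instance $A$ and an element $I \in A$, we say that an output $\sigma$ is $q$-bad if $\sum_i p_i(A)\,\mathbb{1}(I \text{ uncovered at step } i)$
(strictly) exceeds $q$, where $p_i(A)$ is as defined above. We call a permutation $q$-good otherwise. We first
consider the case when the output $\pi$ is $(\ln \delta^{-1})$-good. By the definition of $t$, we have

$$\sum_{i=1}^{t-1} p_i(A) \le \ln \delta^{-1}.$$

Continuing the analysis from above,

$$\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} \le \prod_{i=1}^{t} \exp\left((\exp(\epsilon') - 1)\, p_i(A)\right) \le \exp\left(2\epsilon' \sum_{i=1}^{t} p_i(A)\right)$$
$$\le \exp\left(2\epsilon' (\ln(\tfrac{1}{\delta}) + p_t(A))\right) \le \exp\left(2\epsilon' (\ln(\tfrac{1}{\delta}) + 1)\right).$$

Thus, for any $(\ln \delta^{-1})$-good output $\pi$, we have $\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} \le \exp(\epsilon)$. We can then invoke the following
lemma, proved in the appendix.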

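Lemma 6.4. For any set system $(U, \mathcal{S})$, any instance $A$ and any $I \in A$, the probability that the output $\pi$ of
the algorithm above is $q$-bad is bounded by $\exp(-q)$.

Thus for any set $\mathcal{P}$ of outcomes, we have

$$\Pr[M(A) \in \mathcal{P}] = \sum_{\pi \in \mathcal{P}} \Pr[M(A) = \pi]$$
$$= \sum_{\pi \in \mathcal{P}:\ \pi \text{ is } (\ln \delta^{-1})\text{-good}} \Pr[M(A) = \pi] + \sum_{\pi \in \mathcal{P}:\ \pi \text{ is } (\ln \delta^{-1})\text{-bad}} \Pr[M(A) = \pi]$$
$$\le \sum_{\pi \in \mathcal{P}:\ \pi \text{ is } (\ln \delta^{-1})\text{-good}} \exp(\epsilon) \Pr[M(B) = \pi] + \delta$$
$$\le \exp(\epsilon) \Pr[M(B) \in \mathcal{P}] + \delta.$$

□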



Corollary 6.5. For $\epsilon < 1$ and $\delta = 1/\mathrm{poly}(n)$, there is an $O(\frac{\ln n \ln m}{\epsilon})$-approximation algorithm for the
unweighted set cover problem preserving $(\epsilon, \delta)$-differential privacy.

6.2 Weighted Set Cover 

We are given a set system $(U, \mathcal{S})$ and a cost function $C : \mathcal{S} \to \mathbb{R}$. We must cover a private subset $R \subseteq U$.
W.l.o.g., let $\min_{S \in \mathcal{S}} C(S) = 1$, and denote $\max_{S \in \mathcal{S}} C(S) = W$. Let the cardinality of the set system be
$|\mathcal{S}| = m$, and let $|U| = n$.

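Algorithm 6 Weighted Set Cover

1: let $i \leftarrow 1$, $R_i \leftarrow R$, $\mathcal{S}_i \leftarrow \mathcal{S}$, $r_i \leftarrow n$, $\epsilon' = \frac{\epsilon}{2\ln(e/\delta)}$, $T = \Theta\left(\frac{\log m + \log\log(nW)}{\epsilon'}\right)$
2: while $r_i \ge 1/W$ do
3:   pick a set $S$ from $\mathcal{S}_i$ with probability proportional to $\exp(\epsilon'(|S \cap R_i| - r_i \cdot C(S)))$
     or halve with probability proportional to $\exp(-\epsilon' T)$
4:   if halve then
5:     let $r_{i+1} \leftarrow r_i/2$, $R_{i+1} \leftarrow R_i$, $\mathcal{S}_{i+1} \leftarrow \mathcal{S}_i$, $i \leftarrow i + 1$
6:   else
7:     output set $S$
8:     let $R_{i+1} \leftarrow R_i \setminus S$, $\mathcal{S}_{i+1} \leftarrow \mathcal{S}_i - \{S\}$, $r_{i+1} \leftarrow r_i$, $i \leftarrow i + 1$
9:   end if
10: end while
11: output all remaining sets in $\mathcal{S}_i$ in random order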
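
The listing above can be sketched in Python as follows. Several details here are assumptions rather than choices fixed by the text: $\epsilon'$ is taken to be $\epsilon/(2\ln(e/\delta))$, the hidden constant in $T$ is exposed as the illustrative parameter `t_const`, and the logarithms inside $T$ are shifted by 2 so that the expression stays positive on tiny inputs.

```python
import math
import random

HALVE = object()  # sentinel standing in for the special "halve" outcome

def private_weighted_set_order(sets, costs, private_elements, n, eps, delta,
                               t_const=1.0, rng=None):
    """sets: dict name -> set of elements; costs: dict name -> cost (minimum cost 1)."""
    rng = rng or random.Random()
    m, W = len(sets), max(costs.values())
    eps_prime = eps / (2.0 * math.log(math.e / delta))
    # T = Theta((log m + log log(nW)) / eps'); constant and +2 shifts are illustrative.
    T = t_const * math.log(2 + m * math.log(2 + n * W)) / eps_prime
    r = float(n)
    uncovered = set(private_elements)
    remaining = dict(sets)
    order = []
    while r >= 1.0 / W:
        names = list(remaining)
        scores = [len(remaining[s] & uncovered) - r * costs[s] for s in names] + [-T]
        top = max(scores)
        # Exponential mechanism over the remaining sets plus the halve outcome.
        weights = [math.exp(eps_prime * (u - top)) for u in scores]
        pick = rng.choices(names + [HALVE], weights=weights, k=1)[0]
        if pick is HALVE:
            r /= 2.0
        else:
            order.append(pick)
            uncovered -= remaining.pop(pick)
    leftovers = list(remaining)
    rng.shuffle(leftovers)  # final step: remaining sets in random order
    return order + leftovers
```

Every iteration either halves `r` or removes a set, so the loop runs at most $m + O(\log nW)$ times, matching the iteration count used in the analysis.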



Let us first analyze the utility of the algorithm. If $R = \emptyset$, the algorithm has cost zero and there is nothing
to prove. So we can assume that $OPT \ge 1$. We first show that (whp) $r_i \gtrsim |R_i|/OPT$.

Lemma 6.6. Except with probability $1/\mathrm{poly}(m)$, we have $r_i \ge \frac{|R_i|}{2\,OPT}$ for all iterations $i$.

Proof. Clearly $r_1 = n \ge |R_1|/(2\,OPT)$. For $r_i$ to fall below $|R_i|/(2\,OPT)$, it must be in $(\frac{|R_i|}{2\,OPT}, \frac{|R_i|}{OPT}]$, and be halved in
Step 6 of some iteration $i$. We'll show that this is unlikely: if at some iteration $i$, $\frac{|R_i|}{2\,OPT} \le r_i \le \frac{|R_i|}{OPT}$, then
we argue that with high probability, the algorithm will not output halve and thus not halve $r_i$. Since all
remaining elements $R_i$ can be covered at cost at most OPT, there must exist a set $S$ such that $\frac{|S \cap R_i|}{C(S)} \ge \frac{|R_i|}{OPT}$,
and hence $|S \cap R_i| \ge C(S) \cdot \frac{|R_i|}{OPT} \ge C(S) \cdot r_i$.

Hence $u_i(S) := |S \cap R_i| - r_i \cdot C(S) \ge 0$ in this case, and the algorithm will output $S$ with probability
at least proportional to 1, whereas it outputs halve with probability proportional to $\exp(-\epsilon' T)$. Thus,
$\Pr[\text{algorithm returns halve}] \le \exp(-\epsilon' T) = 1/\mathrm{poly}(m \log nW)$. Since there are $m$ sets in total, and
$r$ ranges from $n$ to $1/W$, there are at most $m + O(\log nW)$ iterations, and the proof follows by a union
bound. □

Let us define a score function $u_i(S) := |S \cap R_i| - r_i \cdot C(S)$, and $u_i(\text{halve}) := -T$: note that in Step 4
of our algorithm, we output either halve or a set $S$, with probabilities proportional to $\exp(\epsilon' u_i(\cdot))$. The
following lemma states that with high probability, none of the sets output by our algorithm have very low
scores (since we are much more likely to output halve than a low-scoring set).



Lemma 6.7. Except with probability at most $1/\mathrm{poly}(m)$, Step 4 only returns sets $S$ with $u_i(S) \ge -2T$.

Proof. There are at most $|\mathcal{S}_i| \le m$ sets $S$ with score $u_i(S) < -2T$, and so one is output with probability
at most proportional to $m \exp(-2T\epsilon')$. We will denote this bad event by $\mathcal{B}$. On the other hand, halve is
output with probability proportional to $\exp(-T\epsilon')$. Hence, $\Pr[\text{halve}]/\Pr[\mathcal{B}] \ge \exp(T\epsilon')/m$, and so $\Pr[\mathcal{B}] \le
m/\exp(T\epsilon') \le 1/\mathrm{poly}(m \log nW)$. Again there are at most $m + O(\log nW)$ iterations, and the lemma follows
by a trivial union bound. □

We now analyze the cost incurred by the algorithm in each stage. Let us divide the algorithm's execution
into stages: stage $j$ consists of all iterations $i$ where $|R_i| \in (\frac{n}{2^j}, \frac{n}{2^{j-1}}]$. Call a set $S$ interesting if it is incident
on an uncovered element when it is picked. Let $I_j$ be the set of interesting sets selected in stage $j$, and $C(I_j)$
be the total cost incurred on these sets.

Lemma 6.8. Consider stages $1, \ldots, j$ of the algorithm. Except with probability $1/\mathrm{poly}(m)$, we can bound
the cost of the interesting sets in stages $1, \ldots, j$ by:

$$\sum_{j' \le j} C(I_{j'}) \le 4j\,OPT \cdot (1 + 2T).$$



Proof. By Lemma 6.7 all the output sets have $u_i(S_i) \ge -2T$ whp. Rewriting, each $S_i$ selected in a round
$j' \le j$ satisfies

$$C(S_i) \le \frac{|S_i \cap R_i| + 2T}{r_i} \le \frac{2^{j'+1}\,OPT}{n} (|S_i \cap R_i| + 2T),$$

where the second inequality is whp, and uses Lemma 6.6. Now summing over all rounds $j' \le j$, we get

$$\sum_{j' \le j} C(I_{j'}) \le \sum_{j' \le j} \frac{2^{j'+1}\,OPT}{n} \sum_{i \text{ s.t. } S_i \in I_{j'}} (|S_i \cap R_i| + 2T).$$

Consider the inner sum for any particular value of $j'$: let the first iteration in stage $j'$ be iteration $i_0$;
naturally $R_i \subseteq R_{i_0}$ for any iteration $i$ in this stage. Now, since $S_i \cap R_i \subseteq R_{i_0}$ and $S_i \cap R_i$ is disjoint
from $S_{i'} \cap R_{i'}$, the sum over $|S_i \cap R_i|$ is at most $|R_{i_0}|$, which is at most $\frac{n}{2^{j'-1}}$ by definition of stage $j'$.
Moreover, since we are only concerned with bounding the cost of interesting sets, each $|S_i \cap R_i| \ge 1$, and so
$|S_i \cap R_i| + 2T \le |S_i \cap R_i|(1 + 2T)$. Putting this together, the summed bound above implies

$$\sum_{j' \le j} C(I_{j'}) \le \sum_{j' \le j} \frac{2^{j'+1}\,OPT}{n} \times \frac{n}{2^{j'-1}} (1 + 2T) = 4j\,OPT\,(1 + 2T),$$

which proves the lemma. □

Theorem 6.9 (Utility). The weighted set cover algorithm incurs a cost of $O(T \log n \, OPT)$ except with
probability $1/\mathrm{poly}(m)$.

Proof. Since the number of uncovered elements halves in each stage by definition, there are at most $1 + \log n$
stages, which by Lemma 6.8 incur a total cost of at most $O(\log n \, OPT \cdot (1 + 2T))$. The sets that remain and
are output at the very end of the algorithm incur cost at most $W$ for each remaining uncovered element;
since $r_i < 1/W$ at the end, Lemma 6.6 implies that $|R_i| < 2\,OPT/W$ (whp), giving an additional cost of at
most $2\,OPT$. □

We can adapt the above argument to bound the expected cost by $O(T \log n \, OPT)$.

Theorem 6.10 (Privacy). For any $\delta > 0$, the weighted set cover algorithm preserves $(\epsilon, \delta)$ differential
privacy.

Proof. We imagine that the algorithm outputs a set named "HALVE" when Step 4 of the algorithm returns
halve, and show that even this output is privacy preserving. Let $A$ and $B$ be two set cover instances that
differ in some element $I$. Say that $\mathcal{S}^I$ is the collection of sets containing $I$. Fix an output $\pi$, and write
$u_{i,j}(A)$ to denote the score of $\pi_j$ (recall this may be halve) after the first $i - 1$ sets in $\pi$ have been selected.

$$\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} = \prod_{i} \left(\frac{\exp(\epsilon' \cdot u_{i,\pi_i}(A)) / (\sum_j \exp(\epsilon' \cdot u_{i,j}(A)))}{\exp(\epsilon' \cdot u_{i,\pi_i}(B)) / (\sum_j \exp(\epsilon' \cdot u_{i,j}(B)))}\right) = \frac{\exp(\epsilon' \cdot u_{t,\pi_t}(A))}{\exp(\epsilon' \cdot u_{t,\pi_t}(B))} \cdot \prod_{i=1}^{t} \left(\frac{\sum_j \exp(\epsilon' \cdot u_{i,j}(B))}{\sum_j \exp(\epsilon' \cdot u_{i,j}(A))}\right),$$

where $t$ is such that $S_{\pi_t}$ is the first set containing $I$ to fall in the permutation $\pi$. After $t$, the remaining
elements in $A$ and $B$ are identical, and all subsequent terms cancel. Moreover, except for the $t$-th term, the
numerators of both the top and bottom expression cancel, since all the relevant set sizes are equal. If $A$
contains $I$ and $B$ does not, the first term is $\exp(\epsilon')$ and each term in the product is at most 1. Since
$\epsilon' \le \epsilon$, we conclude that in this case, for any set $\mathcal{P}$ of outputs, $\Pr[M(A) \in \mathcal{P}] \le \exp(\epsilon) \Pr[M(B) \in \mathcal{P}]$.

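Now suppose that $B$ contains $I$ and $A$ does not. In this case, the first term is $\exp(-\epsilon') < 1$. Moreover,
in instance $B$, every set in $\mathcal{S}^I$ is larger by 1 than in $A$, and all others remain the same size. Therefore, we
have:

$$\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} \le \prod_{i=1}^{t} \left(\frac{(\exp(\epsilon') - 1) \cdot \sum_{j \in \mathcal{S}^I} \exp(\epsilon' \cdot u_{i,j}(A)) + \sum_j \exp(\epsilon' \cdot u_{i,j}(A))}{\sum_j \exp(\epsilon' \cdot u_{i,j}(A))}\right) = \prod_{i=1}^{t} \left(1 + (e^{\epsilon'} - 1) \cdot p_i(A)\right),$$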

where $p_i(A)$ is the probability that a set containing $I$ is chosen at step $i$ of the algorithm running on instance
$A$, conditioned on picking the sets $S_{\pi_1}, \ldots, S_{\pi_{i-1}}$ in the previous steps.

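For an instance $A$ and an element $I \in A$, we say that an output $\sigma$ is $q$-bad if $\sum_i p_i(A)\,\mathbb{1}(I \text{ uncovered at step } i)$
(strictly) exceeds $q$, where $p_i(A)$ is as defined above. We call a permutation $q$-good otherwise. We first
consider the case when the output $\pi$ is $(\ln \delta^{-1})$-good. By the definition of $t$, we have

$$\sum_{i=1}^{t-1} p_i(A) \le \ln \delta^{-1}.$$

Continuing the analysis from above,

$$\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} \le \prod_{i=1}^{t} \exp\left((\exp(\epsilon') - 1)\, p_i(A)\right) \le \exp\left(2\epsilon' \sum_{i=1}^{t} p_i(A)\right)$$
$$\le \exp\left(2\epsilon' (\ln \delta^{-1} + p_t(A))\right) \le \exp\left(2\epsilon' (\ln \delta^{-1} + 1)\right).$$

Thus, for any $(\ln \delta^{-1})$-good output $\pi$, we have $\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} \le \exp(\epsilon)$.

Finally, as in the proof of Theorem 6.3, we can use Lemma 6.4 to complete the proof. □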



6.3 Removing the Dependence on W 

We can remove the dependence of the algorithm on $W$ with a simple idea. For an instance $\mathcal{I} = (U, \mathcal{S})$, let
$\mathcal{S}^j = \{S \in \mathcal{S} \mid C(S) \in (n^j, n^{j+1}]\}$. Let $U^j$ be the set of elements such that the cheapest set containing them
is in $\mathcal{S}^j$. Suppose that for each $j$ and each $S \in \mathcal{S}^j$, we remove all elements that can be covered by a set of
cost at most $n^{j-1}$, and hence define $S'$ to be $S \cap (U^j \cup U^{j-1})$. This would change the cost of the optimal
solution only by a factor of 2, since if we were earlier using $S$ in the optimal solution, we can pick $S'$ and at
most $n$ sets of cost at most $n^{j-1}$ to cover the elements covered by $S \setminus S'$. Call this instance $\mathcal{I}' = (U, \mathcal{S}')$.

Now we partition this instance into two instances $\mathcal{I}_1$ and $\mathcal{I}_2$, where $\mathcal{I}_1 = (\cup_{j \text{ even}} U^j, \mathcal{S}')$, and where
$\mathcal{I}_2 = (\cup_{j \text{ odd}} U^j, \mathcal{S}')$. Since we have just partitioned the universe, the optimal solution on both these instances
costs at most $2\,OPT(\mathcal{I})$. But both these instances $\mathcal{I}_1, \mathcal{I}_2$ are themselves collections of disjoint instances, with
each of these instances having $w_{\max}/w_{\min} \le n^2$; this immediately allows us to remove the dependence on
$W$. Note that this transformation is based only on the set system $(U, \mathcal{S})$, and not on the private subset $R$.

Theorem 6.11. For any $\epsilon \in (0, 1)$, $\delta = 1/\mathrm{poly}(n)$, there is an $O(\log n(\log m + \log\log n)/\epsilon)$-approximation
for the weighted set cover problem that preserves $(\epsilon, \delta)$-differential privacy.

6.4 Lower bounds

Theorem 6.12. Any $\epsilon$-differentially private algorithm that maps elements to sets must have approximation
factor $\Omega(\log m/\epsilon)$, for a set cover instance with $m$ sets and $((\log m)/\epsilon)^{O(1)}$ elements, for any
$\epsilon \in (2\log m/m^{1/20}, 1)$.

Proof. We consider a set system with $|U| = N$ and $\mathcal{S}$ a uniformly random selection of $m$ size-$k$ subsets of
$U$. We will consider problem instances $S_i$ consisting of one of these $m$ subsets, so $OPT(S_i) = 1$. Let $M$ be
an $\epsilon$-differentially private algorithm that on input $T \subseteq U$, outputs an assignment $f$ mapping each element
in $U$ to some set in $\mathcal{S}$ that covers it. The number of possible assignments is at most $m^N$. The cost on input
$T$ under an assignment $f$ is the cardinality of the set $f(T) = \bigcup_{e \in T} f(e)$.

We say assignment $f$ is good for a subset $T \subseteq U$ if its cost $|f(T)|$ is at most $l = k/2$. We first show that
any fixed assignment $f : U \to [m]$, such that $|f^{-1}(j)| \le k$ for all $j$, is unlikely to be good for a randomly
picked size-$k$ subset $T$ of $U$. The number of ways to choose $l$ sets from among those with non-empty $f^{-1}(\cdot)$
is at most $\binom{N}{l}$. Thus the probability that $f$ is good for a random size-$k$ subset is at most $\binom{N}{l}\left(\frac{lk}{N}\right)^k$. Setting
$k = N^{1/10}$, and $l = k/2$, this is at most $2^{-k\log N/4}$.

Let $m = 2^{2\epsilon k}$. The probability that $f$ is good for at least $t$ of our $m$ randomly picked sets is bounded by

$\binom{m}{t}\, 2^{-tk\log N/4} \le 2^{2\epsilon k t}\, 2^{-tk\log N/4} \le 2^{-tk\log N/8}$.

Thus, with probability at most $2^{-Nk\log N/8}$, a fixed assignment is good for more than $N$ of $m$ randomly
chosen size-$k$ sets. Taking a union bound over $m^N = 2^{2\epsilon k N}$ possible assignments, the probability that any
feasible assignment $f$ is good for more than $N$ sets is at most $2^{-Nk\log N/16}$. Thus there exists a selection of
size-$k$ sets $S_1, \ldots, S_m$ such that no feasible assignment $f$ is good for more than $N$ of the $S_i$'s.
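As a numeric sanity check (not part of the original argument), the probability bound $\binom{N}{l}(lk/N)^k$ for a single fixed assignment can be evaluated in log scale. The concrete values of `N`, `k`, `l` and the comparison threshold $-k\log_2 N/4$ below are illustrative assumptions, chosen so that $k = N^{1/10}$ and $l = k/2$.

```python
import math

def log2_good_probability_bound(N, k, l):
    """log2 of C(N, l) * (l*k/N)**k, evaluated without forming huge integers."""
    log2_binom = sum(math.log2((N - i) / (i + 1)) for i in range(l))
    return log2_binom + k * math.log2(l * k / N)

N = 2 ** 40      # illustrative universe size (assumption)
k = 2 ** 4       # k = N^(1/10)
l = k // 2       # l = k/2
bound = log2_good_probability_bound(N, k, l)
reference = -k * math.log2(N) / 4   # exponent of 2^(-k log N / 4)
```

For these values the bound sits well below the reference exponent, which is the kind of slack the union bound over assignments relies on.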

Let $p_{M(\emptyset)}(S_i)$ be the probability that an assignment drawn from the distribution defined by running $M$
on the empty set as input is good for $S_i$. Since any fixed assignment is good for at most $N$ of the $m$ sets,
the average value of $p_{M(\emptyset)}$ is at most $N/m$. Thus there exists a set, say $S_1$, such that $p_{M(\emptyset)}(S_1) \le N/m$. Since
$|S_1| = k$ and $M$ is $\epsilon$-differentially private, $p_{M(S_1)}(S_1) \le \exp(\epsilon k)\, p_{M(\emptyset)}(S_1) < \frac{1}{2}$. Thus with probability at
least half, the assignment $M$ picks on $S_1$ is not good for $S_1$. Since $\mathrm{OPT}(S_1) = 1$, the expected approximation
ratio of $M$ is at least $l/2 = \frac{\log m}{8\epsilon}$.

Additionally, one can take $s$ distinct instances of the above problem, leading to a new instance on $s \cdot N$
elements and $s \cdot m$ sets. OPT is now $s$, while it is easy to check that any private algorithm must cost $\Omega(s \cdot l)$
in expectation. Thus the lower bound in fact rules out additive approximations. □

6.5 An Inefficient Algorithm for Weighted Set Cover 

For completeness, we now show that the lower bound shown above is tight even in the weighted case, in the
absence of computational constraints. Recall that we are given a collection $\mathcal{S}$ of subsets of a universe $U$,
and a private subset $R \subseteq U$ of elements to be covered. Additionally, we have weights on sets; we round up
weights to powers of 2, so that sets in $\mathcal{S}_j$ have weight exactly $2^{-j}$. Without loss of generality, the largest
weight is 1 and the smallest weight is $w = 2^{-L}$.

As before, we will output a permutation $\pi$ on $\mathcal{S}$, with the understanding that the cost $\mathrm{cost}(R, \pi)$ of a
permutation $\pi$ on input $R$ is defined to be the total cost of the set cover resulting from picking the first set
in the permutation containing $e$, for each $e \in R$.

Our algorithm constructs this permutation in a gradual manner. It maintains a permutation $\pi_j$ on $\cup_{i \le j}\mathcal{S}_i$
and a threshold $T_j$. In step $j$, given $\pi_{j-1}$ and $T_{j-1}$, the algorithm constructs a partial permutation $\pi_j$ on
$\cup_{i \le j}\mathcal{S}_i$, and a threshold $T_j$. In each step, we use the exponential mechanism to select an extension with an
appropriate base distribution $\mu_j$ and score function $q$. At the end of step $L$, we get our permutation $\pi = \pi_L$
on $\mathcal{S}$.
Our permutations $\pi_j$ will all have a specific structure. The weight of the $i$th set in the permutation, as a
function of $i$, will be a unimodal function that is non-increasing until $T_j$, and then non-decreasing. In other
words, $\pi_j$ contains sets from $\mathcal{S}_j$ as a continuous block. The sets that appear before $T_j$ are said to be in the
bucket. We call a partial permutation respecting this structure good. We say a good permutation $\pi$ extends
a good partial permutation $\pi_j$ if $\pi_j$ and $\pi$ agree on their ordering on $\cup_{i \le j}\mathcal{S}_i$.

We first define the score function that is used in these choices. A natural objective function would be
$\mathrm{cost}(R, \pi_j) = \min_{\pi\ \text{extends}\ \pi_j} \mathrm{cost}(R, \pi)$, i.e. the cost of the optimal solution conditioned on respecting the
partial permutation $\pi_j$. We use a slight modification of this score function: we force the cover to contain
all sets from the bucket and denote as $\widetilde{\mathrm{cost}}(R, \pi)$ the cost of the resulting cover defined by $R$ on $\pi$. We then define
$\widetilde{\mathrm{cost}}(R, \pi_j)$ naturally as $\min_{\pi\ \text{extends}\ \pi_j} \widetilde{\mathrm{cost}}(R, \pi)$. We first record the following easy facts:

Observation 6.13. For any $R$, $\min_\pi \mathrm{cost}(R, \pi) = \min_\pi \widetilde{\mathrm{cost}}(R, \pi) = \mathrm{OPT}$. Moreover, for any $\pi$, $\widetilde{\mathrm{cost}}(R, \pi) \ge
\mathrm{cost}(R, \pi)$.

To get $(\pi_j, T_j)$ given $(\pi_{j-1}, T_{j-1})$, we insert a permutation $\sigma_j$ of $\mathcal{S}_j$ after the first $T_{j-1}$ elements of $\pi_{j-1}$,
and choose $T_j$, where both $\sigma_j$ and $T_j$ are chosen using the exponential mechanism. The base measure on $\sigma_j$
is uniform and the base measure on $T_j - T_{j-1}$ is the geometric distribution with parameter $1/m^2$.

Let $\mathrm{cost}(A, (\sigma_j, T_j))$ be defined as $\widetilde{\mathrm{cost}}(A, \pi_j) - \widetilde{\mathrm{cost}}(A, \pi_{j-1})$, where $\pi_j$ is constructed from $\pi_{j-1}$ and
$(\sigma_j, T_j)$ as above. The score function we use to pick $(\sigma_j, T_j)$ is $\mathrm{score}_j(R, (\sigma_j, T_j)) = 2^j\,\mathrm{cost}(R, (\sigma_j, T_j))$.
Thus $\Pr[(\sigma_j, T_j)] \propto (1/m^{2(T_j - T_{j-1})}) \exp(\epsilon\,\mathrm{score}_j(R, (\sigma_j, T_j)))$.
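A selection rule of this shape, a base measure reweighted by an exponential of the score, can be sketched generically as follows. This is a minimal illustration and not the paper's implementation: the function names are hypothetical, the candidate set is assumed small enough to enumerate, and `score` is treated as higher-is-better (an assumption; for a cost to be minimized one would pass its negation).

```python
import math
import random

def exp_mech_probs(candidates, base, score, eps):
    """Probabilities proportional to base(c) * exp(eps * score(c))."""
    logw = [math.log(base(c)) + eps * score(c) for c in candidates]
    top = max(logw)                       # subtract the max for numerical stability
    w = [math.exp(x - top) for x in logw]
    total = sum(w)
    return [x / total for x in w]

def exp_mech_sample(candidates, base, score, eps, rng):
    probs = exp_mech_probs(candidates, base, score, eps)
    return rng.choices(candidates, weights=probs, k=1)[0]

# Toy example: threshold increments t = 0, 1, 2 with a geometric-style base measure 4^(-t).
increments = [0, 1, 2]
base = lambda t: 4.0 ** (-t)
score = lambda t: float(t)
p_no_score = exp_mech_probs(increments, base, score, eps=0.0)
p_balanced = exp_mech_probs(increments, base, score, eps=math.log(4))
pick = exp_mech_sample(increments, base, score, eps=1.0, rng=random.Random(0))
```

With `eps = 0` the rule reduces to the base measure, and larger `eps` shifts mass toward high-scoring candidates; in the toy example `eps = ln 4` exactly cancels the geometric decay.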

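Let the optimal solution to the instance contain $n_j$ sets from $\mathcal{S}_j$. Thus $\mathrm{OPT} = \sum_j 2^{-j} n_j$. We first show
that $\widetilde{\mathrm{cost}}(R, \pi_L)$ is $O(\mathrm{OPT} \log m/\epsilon)$. By Observation 6.13, the approximation guarantee would follow.

The probability that the $n_j$ sets in OPT fall in the bucket when picking from the base measure is at least
$1/m^{3n_j}$. When that happens, $\widetilde{\mathrm{cost}}(R, \pi_j) = \widetilde{\mathrm{cost}}(R, \pi_{j-1})$. Thus the exponential mechanism ensures that
except with probability $1/\mathrm{poly}(m)$:

$\widetilde{\mathrm{cost}}(R, \pi_j) \le \widetilde{\mathrm{cost}}(R, \pi_{j-1}) + 4 \cdot 2^{-j}\log(m^{3n_j})/\epsilon = \widetilde{\mathrm{cost}}(R, \pi_{j-1}) + 12 \cdot 2^{-j} n_j \log m/\epsilon$

Thus with high probability,

$\widetilde{\mathrm{cost}}(R, \pi_L) \le \widetilde{\mathrm{cost}}(R, \pi_0) + 12 \sum_j 2^{-j} n_j \log m/\epsilon = \mathrm{OPT} + 12\,\mathrm{OPT} \log m/\epsilon$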

Finally, we analyze the privacy. Let $e \in U$ be an element such that the cheapest set covering $e$ has
cost $2^{-j_e}$. Let $A$ and $B$ be two instances that differ in element $e$. It is easy to see that $|\mathrm{cost}(A, (\sigma_j, T_j)) -
\mathrm{cost}(B, (\sigma_j, T_j))|$ is bounded by $2^{-j}$ for all $j$. We show something stronger:

Lemma 6.14. For any good partial permutation $\pi_j$ and any $A$, $B$ such that $A = B \cup \{e\}$,
$|\mathrm{score}_j(A, (\sigma_j, T_j)) - \mathrm{score}_j(B, (\sigma_j, T_j))| \le \begin{cases} 2^{j - j_e + 1} & \text{if } j \le j_e \\ 0 & \text{if } j > j_e \end{cases}$

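Proof. Let $\pi_B$ be the permutation realizing $\widetilde{\mathrm{cost}}(B, \pi_j)$. For $j \le j_e$, if $e$ is covered by a set in the bucket in $\pi_B$,
then the cost of $\pi_B$ is no larger in instance $A$ and hence $\widetilde{\mathrm{cost}}(A, \pi_j) = \widetilde{\mathrm{cost}}(B, \pi_j)$.¹ In the case that the
bucket in $\pi_B$ does not cover $e$, then $\widetilde{\mathrm{cost}}(A, \pi_j) \le \widetilde{\mathrm{cost}}(A, \pi_B) = \widetilde{\mathrm{cost}}(B, \pi_B) + 2^{-j_e} = \widetilde{\mathrm{cost}}(B, \pi_j) + 2^{-j_e}$.
Since this also holds for $\pi_{j-1}$, this implies the claim for $j \le j_e$.

For $j > j_e$, observe that the first set in $\pi_B$ that covers $e$ is fully determined by the partial permutation
$\pi_j$, since the sets in $\cup_{i > j_e}\mathcal{S}_i$ do not contain $e$. Thus $\mathrm{cost}(A, (\sigma_j, T_j)) = \mathrm{cost}(B, (\sigma_j, T_j))$ and the claim
follows. □

¹We remark that this is not true for the function $\mathrm{cost}$, and is the reason we had to modify it to $\widetilde{\mathrm{cost}}$.

Then for any $j \le j_e$, Lemma 6.14 implies that for any $(\sigma_j, T_j)$,
$\exp(\epsilon(\mathrm{score}_j(A, (\sigma_j, T_j)) - \mathrm{score}_j(B, (\sigma_j, T_j)))) \in [\exp(-\epsilon 2^{j - j_e + 1}), \exp(\epsilon 2^{j - j_e + 1})]$. Thus

$\frac{\Pr[(\sigma_j, T_j) \mid A]}{\Pr[(\sigma_j, T_j) \mid B]} \in [\exp(-\epsilon 2^{j - j_e + 2}), \exp(\epsilon 2^{j - j_e + 2})]$.

Moreover, for any $j > j_e$, this ratio is 1. Thus

$\frac{\Pr[(\sigma_1, T_1), \ldots, (\sigma_L, T_L) \mid A]}{\Pr[(\sigma_1, T_1), \ldots, (\sigma_L, T_L) \mid B]} \in \left[\prod_{j \le j_e}\exp(-2^{j - j_e + 2}\epsilon),\ \prod_{j \le j_e}\exp(2^{j - j_e + 2}\epsilon)\right] \subseteq [\exp(-8\epsilon), \exp(8\epsilon)]$,

which implies $8\epsilon$-differential privacy.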

7 Facility Location 

Consider the metric facility location problem: we are given a metric space $(V, d)$, a facility cost $f$ and a
(private) set of demand points $D \subseteq V$. We want to select a set of facilities $F \subseteq V$ to minimize $\sum_{v \in D} d(v, F) +
f \cdot |F|$. (Note that we assume "uniform" facility costs here instead of different costs $f_i$ for different $i \in V$.)
Assume that distances are at least 1, and let $\Delta = \max_{u,v} d(u, v)$ denote the diameter of the space.

We use the result of Fakcharoenphol et al. [FRT04] that any metric space on $n$ points can be approximated
by a distribution over dominating trees with expected stretch $O(\log n)$; moreover all the trees in the support
of the distribution are rooted 2-HSTs: they have $L = O(\log \Delta)$ levels, with the leaves (at level 0) being
exactly $V$, the internal nodes being all Steiner nodes, the root having level $L$, and all edges between
levels $(i + 1)$ and $i$ having length $2^i$. Given such a tree $T$ and node $v$ at level $i$, let $T_v$ denote the (vertices
in) the subtree rooted at $v$.

By Corollary 4.6, it is clear that we cannot output the actual set of facilities, so we will instead output
instructions in the form of an HST $T = (V_T, E_T)$ and a set of facilities $F \subseteq V_T$: each demand $x \in D$ then
gets assigned to its ancestor facility at the lowest level in the tree. (We guarantee that the root is always in
$F$, hence this is well-defined.) Now we are charged for the connection costs, and for the facilities that have
at least one demand assigned to them.

Algorithm 7 The Facility Location Algorithm
1: Input: Metric $(V, d)$, facility cost $f$, demands $D \subseteq V$, $\epsilon$.
2: Pick a random distance-preserving FRT tree $T$; recall this is a 2-HST with $L = O(\log \Delta)$ levels.
3: let $F \leftarrow$ root $r$.
4: for $i = 1$ to $L$ do
5:   for all vertices $v$ at level $i$ do
6:     let $N_v = |D \cap T_v|$ and $\tilde{N}_v = N_v + \mathrm{Lap}(L/\epsilon)$.
7:     if $\tilde{N}_v \cdot 2^i > f$ then $F \leftarrow F \cup \{v\}$.
8:   end for
9: end for
10: output $(T, F)$: each demand $x \in D$ is assigned to the ancestor facility at lowest level in $T$.
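The thresholding loop of the algorithm can be sketched as below, assuming the FRT tree has already been sampled and is handed over as plain dictionaries. The representation (`children`, `level`, `parent`) and the function names are hypothetical choices for this illustration; sampling the tree itself is not shown. The `noise` argument is injectable so that the noiseless variant discussed in the analysis can be run as well.

```python
import random

def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def open_facilities(children, level, root, demands, f, eps, rng, noise=laplace):
    """Open every internal node v at level i whose noisy demand count exceeds f / 2^i."""
    L = level[root]
    counts = {}

    def count(v):
        kids = children.get(v, [])
        counts[v] = demands.count(v) if not kids else sum(count(u) for u in kids)
        return counts[v]

    count(root)
    F = {root}                                   # the root is always a facility
    for v, i in level.items():
        if i >= 1 and counts[v] + noise(L / eps, rng) > f / 2 ** i:
            F.add(v)
    return F

def assign(parent, F, x):
    """Lowest ancestor of demand point x that is an open facility."""
    v = parent[x]
    while v not in F:
        v = parent[v]
    return v

# Toy 2-level tree: root r, internal nodes a and b, leaves 1, 2, 3.
children = {"r": ["a", "b"], "a": [1, 2], "b": [3]}
level = {"r": 2, "a": 1, "b": 1, 1: 0, 2: 0, 3: 0}
parent = {"a": "r", "b": "r", 1: "a", 2: "a", 3: "b"}
noiseless = lambda scale, rng: 0.0
F = open_facilities(children, level, "r", [1, 2, 2], f=5, eps=1.0,
                    rng=random.Random(0), noise=noiseless)
```

In the toy run the subtree under `a` holds three demands, so `a` opens, while `b` stays closed and its leaf falls back to the root.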



Theorem 7.1. The above algorithm preserves $\epsilon$-differential privacy and outputs a solution of cost $\mathrm{OPT} \cdot
O(\log n \log \Delta) \cdot \frac{\log \Delta}{\epsilon} \log(n \log \Delta)$.

For the privacy analysis, instead of outputting the set $F$ we could imagine outputting the tree $T$ and
all the counts $\tilde{N}_v$; this information clearly determines $F$. Note that the tree is completely oblivious of the
demand set. Since adding or removing any particular demand vertex can only change $L$ counts, and the
noise added in Step 6 gives us $\epsilon/L$-differential privacy, the fact that differential privacy composes linearly
gives us the privacy claim.

For the utility analysis, consider the "noiseless" version of the algorithm which opens a facility at $v$ when
$N_v \cdot 2^i > f$. It can be shown that this ideal algorithm incurs cost at most $f + O(\log n \log \Delta) \cdot \mathrm{OPT}$ (see,
e.g., [Ind04, Theorem 3]). We now have two additional sources of error due to the noise:

• Consider the case when $N_v \cdot 2^i > f > \tilde{N}_v \cdot 2^i$, which increases the connection cost of some demands in
$D$. However, the noise is symmetric, and so we overshoot the mark with probability at most 1/2, and
when this happens the 2-HST property ensures that the connection cost for any demand $x$ increases
by at most a factor of 2. Since there are at most $L = O(\log \Delta)$ levels, the expected connection cost
increases by at most a factor of $L$.

• Consider the other case when $N_v \cdot 2^i < f < \tilde{N}_v \cdot 2^i$, which increases the facility cost. Note that if
$N_v \cdot 2^i \ge f/2$, then opening a facility at $v$ can be charged again in the same way as for the noiseless
algorithm (up to a factor of 2). Hence suppose that $\tilde{N}_v - N_v > \frac{1}{2}(f/2^i)$, and hence we need to consider
the probability $p_i$ of the event that $\mathrm{Lap}(L/\epsilon) > \frac{1}{2}(f/2^i)$, which is at most $\exp(-\frac{\epsilon f}{L 2^{i+1}})$.

Note that if for some value of $i$, $f \ge \frac{L 2^{i+1}}{\epsilon} \log(Ln)$, the above probability $p_i$ is at most $1/Ln$, and
hence the expected cost of opening up spurious facilities at nodes with such values of $i$ is at most
$(1/Ln) \cdot Ln \cdot f = f$. (There are $L$ levels, and at most $n$ nodes at each level.)

For the values of $i$ which are higher, i.e., for which $f < \frac{L 2^{i+1}}{\epsilon} \log(Ln)$, we pay for this facility only if
there is a demand $x \in D$ in the subtree below $v$ that actually uses this facility. Hence this demand $x$
must have used a facility above $v$ in the noiseless solution, and we can charge the cost $f$ of opening
this facility to the length $2^{i+1}$ of the edge above $v$. Thus the total cost of spurious facilities we pay for is
the cost of the noiseless solution times a factor $\frac{L}{\epsilon} \log(Ln)$.

Thus the expected cost of the solution is at most

$\mathrm{OPT} \cdot O(\log n \log \Delta) \cdot \frac{\log \Delta}{\epsilon} \log(n \log \Delta)$. (7.8)

8 Combinatorial Public Projects (Submodular Maximization) 

Recently Papadimitriou et al. [PSS08] introduced the Combinatorial Public Projects Problem (CPP Problem)
and showed that there is a succinctly representable version of the problem for which, although there exists
a constant factor approximation algorithm, no efficient truthful algorithm can guarantee an approximation
ratio better than $m^{1/2 - \epsilon}$, unless $NP \subseteq BPP$. Here we adapt our set cover algorithm to give a privacy
preserving approximation to the CPP problem within logarithmic (additive) factors.

In the CPP problem, we have $n$ agents and $m$ resources publicly known. Each agent submits a private
non-decreasing and submodular valuation function $f_i$ over subsets of resources, and our goal is to select a size-$k$
subset $S$ of the resources to maximize $\sum_{i=1}^n f_i(S)$. We assume that we have oracle access to the functions
$f_i$. Note that since each $f_i$ is submodular, so is $\sum_{i=1}^n f_i(S)$, and so our goal is to produce an algorithm for
submodular maximization that preserves the privacy of the individual agent valuation functions. Without loss
of generality, we will scale the valuation functions such that they take maximum value 1: $\max_S f_i(S) = 1$.

Once again, we have an easy computationally inefficient algorithm. 

Theorem 8.1. The exponential mechanism when used to choose $k$ sets runs in time $O(\binom{m}{k}\mathrm{poly}(n))$ and has
expected quality at least $(1 - 1/e)\mathrm{OPT} - O(\log\binom{m}{k}/\epsilon)$.

We next give a computationally efficient algorithm with slightly worse guarantees. We adapt our
unweighted set cover algorithm, simply selecting $k$ items greedily:



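Algorithm 8 CPP Problem
1: Input: A set $M$ of $m$ resources, private functions $f_1, \ldots, f_n$, a number of resources $k$, $\epsilon$, $\delta$.
2: let $M_1 \leftarrow M$, $F(x) := \sum_{i=1}^n f_i(x)$, $S_1 \leftarrow \emptyset$, $\epsilon' \leftarrow \frac{\epsilon}{(e-1)\ln(e/\delta)}$.
3: for $i = 1$ to $k$ do
4:   pick a resource $r$ from $M_i$ with probability proportional to $\exp(\epsilon'(F(S_i + \{r\}) - F(S_i)))$.
5:   let $M_{i+1} \leftarrow M_i - \{r\}$, $S_{i+1} \leftarrow S_i + \{r\}$.
6: end for
7: Output $S_{k+1}$.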
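The greedy loop can be sketched as follows. This is an illustration only: valuations are assumed to be Python callables on frozensets, `eps_prime` is passed in directly rather than derived from the privacy parameters, and the `wants` helper (a simple coverage-style valuation with maximum value 1) is a hypothetical example.

```python
import math
import random

def cpp_greedy(resources, valuations, k, eps_prime, rng):
    """Pick k resources, each with probability proportional to exp(eps' * marginal gain)."""
    F = lambda S: sum(f(S) for f in valuations)
    remaining, chosen = list(resources), []
    for _ in range(k):
        current = F(frozenset(chosen))
        gains = [F(frozenset(chosen + [r])) - current for r in remaining]
        top = max(gains)                          # shift by the max for numerical stability
        weights = [math.exp(eps_prime * (g - top)) for g in gains]
        r = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(r)
        chosen.append(r)
    return chosen

def wants(items):
    # Agent is fully satisfied (value 1) as soon as any wanted resource is selected.
    return lambda S: 1.0 if S & items else 0.0

agents = [wants({"a"}), wants({"a"}), wants({"b"})]
greedy_like = cpp_greedy(["a", "b", "c"], agents, k=2, eps_prime=1e6, rng=random.Random(0))
private_run = cpp_greedy(["a", "b", "c"], agents, k=2, eps_prime=0.5, rng=random.Random(0))
```

With a very large `eps_prime` the rule degenerates to the non-private greedy algorithm, which makes the trade-off between privacy and solution quality easy to observe on small inputs.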
8.1 Utility Analysis



Theorem 8.2. Except with probability $O(1/\mathrm{poly}(n))$, the algorithm for the CPP problem returns a solution
with quality at least $(1 - 1/e)\mathrm{OPT} - O(k \log m/\epsilon')$.

Proof. Since $F$ is submodular and there exists a set $S^*$ with $|S^*| = k$ and $F(S^*) = \mathrm{OPT}$, there always exists
a resource $r$ such that $F(S_i + \{r\}) - F(S_i) \ge (\mathrm{OPT} - F(S_i))/k$. If we always selected the optimizing
resource, the distance to OPT would decrease by a factor of $1 - 1/k$ each round, and we would achieve
an approximation factor of $1 - 1/e$. Instead, we use the exponential mechanism which, by (2.4), selects a
resource within $4\ln m/\epsilon'$ of the optimizing resource with probability at least $1 - 1/m^3$. With probability at
least $1 - k/m^3$ each of the $k$ selections decreases $\mathrm{OPT} - F(S_i)$ by a factor of $(1 - 1/k)$, while increasing it
by at most an additive $4\ln m/\epsilon'$, giving $(1 - 1/e)\mathrm{OPT} - O(k\ln m/\epsilon')$. □
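The last step of the proof can be unrolled explicitly. The following is a worked sketch rather than text from the paper; the symbols $g_i$ and $a$ are introduced here for convenience, and it assumes $F(\emptyset) \ge 0$ and that every one of the $k$ selections lands within $a$ of the best available marginal gain.

```latex
% g_i := OPT - F(S_i) is the remaining gap, a := 4 ln(m) / eps' the per-round slack.
\begin{align*}
g_{i+1} &\le \Bigl(1 - \tfrac{1}{k}\Bigr) g_i + a
  && \text{(the chosen gain is at least } g_i/k - a\text{)} \\
g_{k+1} &\le \Bigl(1 - \tfrac{1}{k}\Bigr)^{k} g_1
  + a \sum_{t=0}^{k-1} \Bigl(1 - \tfrac{1}{k}\Bigr)^{t}
  \;\le\; \frac{\mathrm{OPT}}{e} + k a \\
F(S_{k+1}) &= \mathrm{OPT} - g_{k+1}
  \;\ge\; \Bigl(1 - \tfrac{1}{e}\Bigr)\mathrm{OPT} - \frac{4 k \ln m}{\epsilon'}
\end{align*}
```

The second line uses $(1 - 1/k)^k \le 1/e$ and bounds each term of the geometric sum by 1.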

8.2 Privacy Analysis 

Theorem 8.3. For any $\delta \le 1/2$, the CPP problem algorithm preserves $(\epsilon'(e - 1)\ln(e/\delta), \delta)$-differential
privacy.

Proof. Let $A$ and $B$ be two CPP instances that differ in a single agent $I$ with utility function $f_I$. We show
that the output set of resources, even revealing the order in which the resources were chosen, is privacy
preserving. Fix some ordered set of $k$ resources, $\pi_1, \ldots, \pi_k$; write $S_i = \cup_{j=1}^{i-1}\{\pi_j\}$ to denote the first $i - 1$
elements, and write $s_{i,j}(A) = F_A(S_i + \{j\}) - F_A(S_i)$ to denote the marginal utility of item $j$ at time $i$ in
instance $A$. Define $s_{i,j}(B)$ similarly for instance $B$. We consider the relative probability of our mechanism
outputting ordering $\pi$ when given inputs $A$ and $B$:

$\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} = \prod_{i=1}^{k} \left( \frac{\exp(\epsilon' \cdot s_{i,\pi_i}(A)) / (\sum_j \exp(\epsilon' \cdot s_{i,j}(A)))}{\exp(\epsilon' \cdot s_{i,\pi_i}(B)) / (\sum_j \exp(\epsilon' \cdot s_{i,j}(B)))} \right)$,

where the sum over $j$ is over all remaining unselected resources. We can separate this into two products

$\prod_{i=1}^{k} \left( \frac{\exp(\epsilon' \cdot s_{i,\pi_i}(A))}{\exp(\epsilon' \cdot s_{i,\pi_i}(B))} \right) \cdot \prod_{i=1}^{k} \left( \frac{\sum_j \exp(\epsilon' \cdot s_{i,j}(B))}{\sum_j \exp(\epsilon' \cdot s_{i,j}(A))} \right)$.

If $A$ contains agent $I$ but $B$ does not, the second product is at most 1, and the first is at most
$\exp(\epsilon' \sum_{i=1}^{k} (f_I(S_{i+1}) - f_I(S_i))) \le \exp(\epsilon')$. If $B$ contains agent $I$, and $A$ does not, the first product is at most 1, and in the
remainder of the proof, we focus on this case. We will write $\beta_{i,j} = s_{i,j}(B) - s_{i,j}(A)$ to be the additional
marginal utility of item $j$ at time $i$ in instance $B$ over instance $A$, due to agent $I$. Thus

$\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} \le \prod_{i=1}^{k} \left( \frac{\sum_j \exp(\epsilon' \cdot s_{i,j}(B))}{\sum_j \exp(\epsilon' \cdot s_{i,j}(A))} \right) = \prod_{i=1}^{k} \left( \frac{\sum_j \exp(\epsilon' \beta_{i,j}) \cdot \exp(\epsilon' \cdot s_{i,j}(A))}{\sum_j \exp(\epsilon' \cdot s_{i,j}(A))} \right) = \prod_{i=1}^{k} E_i[\exp(\epsilon' \beta_i)]$,

where $\beta_i$ is the marginal utility actually achieved at time $i$ by agent $I$, and the expectation is taken over the
probability distribution over resources selected at time $i$ in instance $A$. For all $0 \le x \le 1$, $e^x \le 1 + (e - 1) \cdot x$.
Therefore, for all $\epsilon' \le 1$, we have:

$\prod_{i=1}^{k} E_i[\exp(\epsilon' \beta_i)] \le \prod_{i=1}^{k} E_i[1 + (e - 1)\epsilon' \beta_i] \le \exp\left((e - 1)\epsilon' \sum_{i=1}^{k} E_i[\beta_i]\right)$.
As in the set-cover proof, we split the set of possible outputs into two sets. We call an output sequence
$q$-good for an agent $I$ in instance $A$ if this sum $\sum_{i=1}^{k} E_i[\beta_i]$ is bounded above by $q$, and call it $q$-bad otherwise.
For a $(\ln(e\delta^{-1}))$-good output $\pi$, we can then write

$\frac{\Pr[M(A) = \pi]}{\Pr[M(B) = \pi]} \le \exp((e - 1) \cdot \epsilon' \cdot \ln(e\delta^{-1}))$.

Moreover, note that since the total realized utility of any agent is at most 1, if agent $I$ has realized utility
$u_{i-1}$ before the $i$th set is chosen, then $\beta_i$ is distributed in $[0, 1 - u_{i-1}]$. Moreover, $u_i = u_{i-1} + \beta_i$. Lemma B.2
then implies that the probability that the algorithm outputs a $(\ln(e\delta^{-1}))$-bad permutation is at most $\delta$.
The theorem follows. □

Remark 1. By choosing $\epsilon' = \epsilon/k$, we immediately get $\epsilon$-differential privacy and expected utility at least
$(1 - 1/e)\mathrm{OPT} - O(k^2 \ln m/\epsilon)$. This may give better guarantees for some values of $k$ and $\delta$.

We remark that the $k$-coverage problem is a special case of the CPP problem. Therefore:

Corollary 8.4. The CPP algorithm (with sets as resources) is an $(\epsilon, \delta)$-differential privacy preserving
algorithm for the $k$-coverage problem achieving approximation factor at least $(1 - 1/e)\mathrm{OPT} - O(k \log m \log(2/\delta)/\epsilon)$.

8.3 Truthfulness 

The CPP problem can be viewed as a mechanism design problem when each agent $i$ has a choice of whether
to submit his actual valuation function $f_i$, or to lie and submit a different valuation function $f_i'$ if such a
misrepresentation yields a better outcome for agent $i$. A mechanism is truthful if for every valuation function
of agents $j \ne i$, and every valuation function $f_i$ of agent $i$, there is never a function $f_i' \ne f_i$ such that agent
$i$ can benefit by misrepresenting his valuation function as $f_i'$. Intuitively, a mechanism is approximately
truthful if no agent can make more than a slight gain by not truthfully reporting.

Definition 3. A mechanism for the CPP problem is $\gamma$-truthful if for every agent $i$, for every set of player
valuations $f_j$ for $j \ne i$, and for every valuation function $f_i' \ne f_i$:

$E[f_i(M(f_1, \ldots, f_i, \ldots, f_n))] \ge E[f_i(M(f_1, \ldots, f_i', \ldots, f_n))] - \gamma$

Note that 0-truthfulness corresponds to the usual notion of (exact) truthfulness.

$(\epsilon, \delta)$-differential privacy in our setting immediately implies $(2\epsilon + \delta)$-approximate truthfulness. We note
that Papadimitriou et al. [PSS08] showed that the CPP problem is inapproximable to an $m^{1/2 - \epsilon}$ multiplicative
factor by any polynomial time 0-truthful mechanism. Our result shows that relaxing that to $\gamma$-truthfulness
allows us to give a constant approximation to the utility whenever $\mathrm{OPT} \ge 2k \log m \log(1/\gamma)/\gamma$ for any $\gamma$.

8.4 Lower Bounds 

Theorem 8.5. No $\epsilon$-differentially private algorithm for the maximum coverage problem can guarantee profit
larger than $\mathrm{OPT} - (k\log(m/k)/20\epsilon)$.

The proof is almost identical to that of the lower bound Theorem 4.5 for $k$-median, and hence is omitted.

9 Steiner Forest 

Consider the Steiner network problem, where we are given a metric space $M = (V, d)$ on $n$ points, and
a (private) subset $R \subseteq V \times V$ of source-sink (terminal) pairs. The goal is to buy a minimum-cost set of
edges $E(R) \subseteq \binom{V}{2}$ such that these edges connect up each terminal pair in $R$. As in previous cases, we give
instructions in the form of a tree $T = (V, E_T)$; each terminal pair $(u, v) \in R$ takes the unique path $P_T(u, v)$
in this tree $T$ between themselves, and the (implicit) solution is the set of edges $E(R) = \cup_{(u,v) \in R} P_T(u, v)$.

The tree $T$ is given by the randomized construction of Fakcharoenphol et al. [FRT04], which guarantees
that $E[\mathrm{cost}(E(R))] \le O(\log n) \cdot \mathrm{OPT}$; moreover, since the construction is oblivious to the set $R$, it preserves
the privacy of the terminal pairs perfectly (i.e., $\epsilon = 0$). The same idea can be used for a variety of network
design problems (such as the "buy-at-bulk" problem) which can be solved by reducing them to a tree instance.
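Once the tree is published, each terminal pair can reconstruct its own path locally. A minimal sketch of computing the implicit solution $E(R)$, assuming the tree is given as a hypothetical `parent` map (the root has no entry) and edges are represented as two-element frozensets:

```python
def tree_path_edges(parent, u, v):
    """Edges on the unique u-v path in a rooted tree given by a parent map."""
    ancestors = []
    x = u
    while x is not None:                 # walk from u up to the root
        ancestors.append(x)
        x = parent.get(x)
    position = {a: i for i, a in enumerate(ancestors)}
    edges = set()
    x = v
    while x not in position:             # climb from v until we meet u's ancestor chain
        edges.add(frozenset((x, parent[x])))
        x = parent[x]
    for a in ancestors[:position[x]]:    # u's side of the path, up to the meeting point
        edges.add(frozenset((a, parent[a])))
    return edges

def steiner_forest_edges(parent, pairs):
    """Union of tree paths over all terminal pairs."""
    solution = set()
    for u, v in pairs:
        solution |= tree_path_edges(parent, u, v)
    return solution

parent = {"a": "r", "b": "r", "c": "a", "d": "a"}
edges = steiner_forest_edges(parent, [("c", "d"), ("c", "b")])
```

The tree itself never looks at the pairs, which is why publishing it costs nothing in privacy; only the pairs' own path computations depend on $R$.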
10 Private Amplification Theorem

In this section, we show that differentially private mechanisms that give good guarantees in expectation can 
be repeated privately to amplify the probability of a good outcome. First note that if we simply repeat a 
private algorithm T times, and select the best outcome, we can get the following result: 

Theorem 10.1. Let $M : D \to R$ be an $\epsilon$-differentially private mechanism such that for a query function $q$,
and a parameter $Q$, $\Pr[q(A, M(A)) \ge Q] \ge \frac{1}{2}$. Then for any $\delta > 0$, $\epsilon' \in (0, \frac{1}{2})$, there is a mechanism $M'$
which satisfies the following properties:

• Utility: $\Pr[q(A, M'(A)) \ge Q] \ge (1 - 2^{-T})$.

• Efficiency: $M'$ makes $T$ calls to $M$.

• Privacy: $M'$ satisfies $(\epsilon T)$-differential privacy.

Note that the privacy parameter degrades linearly with T. Thus to bring down the failure probability to 
inverse polynomial, one will have to make T logarithmic. To get e'-differential privacy, one would then take 
e to be e'/T. If Q was inversely proportional to e, as is the case in many of our algorithms, this leads to an 
additional logarithmic loss. The next theorem shows a more sophisticated amplification technique that does 
better. 

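Theorem 10.2 (Private Amplification Theorem). Let $M : D \to R$ be an $\epsilon$-differentially private mechanism
such that for a query function $q$ with sensitivity 1, and a parameter $Q$, $\Pr[q(A, M(A)) \ge Q] \ge p$ for some
$p \in (0, 1)$. Then for any $\delta > 0$, $\epsilon' \in (0, \frac{1}{2})$, there is a mechanism $M'$ which satisfies the following properties:

• Utility: $\Pr[q(A, M'(A)) \ge Q - \frac{4}{\epsilon'}\log(\frac{1}{\epsilon'\delta p})] \ge (1 - \delta)$.

• Efficiency: $M'$ makes $O((\frac{1}{\epsilon'\delta p})^2 \log(\frac{1}{\epsilon'\delta p}))$ calls to $M$.

• Privacy: $M'$ satisfies $(\epsilon + 8\epsilon')$-differential privacy.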

Proof Let T = (77!^)^ log( p^). The mechanism M' runs M on the input A independently (T + 1) times 
to get outputs Si — {ri, . . . , rx+i}- It also adds in T' = V4T log t ^j^^-^j-Qy outcomes S2 ~ {si, . . . , st'} 
and selects an outcome from $S_1 \cup S_2$ using the exponential mechanism with privacy parameter $\epsilon'$ and score
function

$\tilde{q}(A, r) = \begin{cases} \min\{Q, q(A, r)\} & \text{if } r \in S_1 \\ Q & \text{if } r \in S_2 \end{cases}$

The efficiency of $M'$ is immediate from the construction. To analyze the utility, note that (2.4) ensures
that the exponential mechanism's output $r$ satisfies $\tilde{q}(A, r) \ge Q - \frac{4}{\epsilon'}\log(\frac{1}{\epsilon'\delta p})$ with probability $(1 - \frac{\delta}{2})$.
Conditioned on the output $r$ satisfying this property, the ratio $\Pr[r \in S_1]/\Pr[r \in S_2]$ is at least $|\{r \in S_1 :
q(A, r) \ge Q\}|/|S_2|$. Since the numerator is at least $pT$ in expectation, the probability of $r$ being a dummy
outcome is at most $\frac{\delta}{2}$. This establishes the utility property.
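The construction of $M'$ can be sketched as follows. This is a minimal illustration under stated assumptions: the repetition count `T` and the number of dummy outcomes `T_dummy` are taken as plain parameters (their settings in the proof are not reproduced here), the names `amplify`, `run_M` and `DUMMY` are hypothetical, and the exponential mechanism is implemented as a direct weighted draw.

```python
import math
import random

DUMMY = None  # sentinel standing in for a dummy outcome

def amplify(run_M, q, A, Q, T, T_dummy, eps_prime, rng):
    """Run M (T + 1) times, add dummy outcomes, and pick one via the exponential mechanism."""
    real = [run_M(A) for _ in range(T + 1)]
    # Real outcomes score min(Q, q(A, r)); dummy outcomes score exactly Q.
    scores = [min(Q, q(A, r)) for r in real] + [Q] * T_dummy
    outcomes = real + [DUMMY] * T_dummy
    top = max(scores)
    weights = [math.exp(eps_prime * (s - top)) for s in scores]
    return rng.choices(outcomes, weights=weights, k=1)[0]

# Toy mechanism: always returns 7, whose quality 10 exceeds the target Q = 3.
always_seven = lambda A: 7
quality = lambda A, r: 10.0
no_dummies = amplify(always_seven, quality, A=None, Q=3.0, T=4, T_dummy=0,
                     eps_prime=0.1, rng=random.Random(0))
with_dummies = amplify(always_seven, quality, A=None, Q=3.0, T=4, T_dummy=3,
                       eps_prime=0.1, rng=random.Random(0))
```

Capping real scores at `Q` is what keeps a single very good run from dominating the draw, while the dummy outcomes keep the normalizing denominator bounded away from zero.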



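We now show the privacy property. For any $r_0 \in R$,

$\Pr[M'(A) = r_0] = \sum_{i=1}^{T+1} \Pr[r_i = r_0] \cdot E\left[ \frac{\exp(\epsilon' \tilde{q}(A, r_0))}{\sum_{r \in S_1} \exp(\epsilon' \tilde{q}(A, r)) + T' \exp(\epsilon' Q)} \,\Big|\, r_i = r_0 \right]$

$= (T + 1) \cdot \Pr[M(A) \text{ outputs } r_0] \cdot \exp(\epsilon' \tilde{q}(A, r_0)) \cdot E\left[ \frac{1}{\sum_{r \in S_1} \exp(\epsilon' \tilde{q}(A, r)) + T' \exp(\epsilon' Q)} \,\Big|\, r_{T+1} = r_0 \right]$

$= (T + 1) \cdot \Pr[M(A) \text{ outputs } r_0] \cdot \exp(\epsilon' \tilde{q}(A, r_0)) \cdot \exp(-\epsilon' Q) \cdot E\left[ \frac{1}{\sum_{i=1}^{T} \exp(\epsilon'(\tilde{q}(A, r_i) - Q)) + \exp(\epsilon'(\tilde{q}(A, r_0) - Q)) + T'} \right]$ (10.9)

where the expectation is also taken over runs $1, \ldots, T$ of $M$ (we've explicitly conditioned on run $(T + 1)$
producing $r_0$).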

It is easy to bound the change in the first two terms when we change from input $A$ to a neighboring input
$B$, since $M$ satisfies $\epsilon$-differential privacy, and $\tilde{q}$ has sensitivity 1. Let $D = D(A)$ denote the denominator in
the final expectation; we would like to show that $E[\frac{1}{D(A)}] \le \exp(7\epsilon')E[\frac{1}{D(B)}]$ for neighboring inputs $A$ and $B$.
Let $C = \exp(\epsilon'(\tilde{q}(A, r_0) - Q)) + T'$ denote the constant term in $D(A)$.
First observe that

$E[D(A)] = C + T \cdot E_{r \in M(A)}[\exp(\epsilon'(\tilde{q}(A, r) - Q))]$

$\ge C + T \cdot \exp(-\epsilon') \cdot E_{r \in M(A)}[\exp(\epsilon'(\tilde{q}(B, r) - Q))]$

$\ge C + T \cdot \exp(-2\epsilon') \cdot E_{r \in M(B)}[\exp(\epsilon'(\tilde{q}(B, r) - Q))]$

$\ge \exp(-2\epsilon') \cdot E[D(B)]$,

where the first inequality follows from the sensitivity of $q$ and the second from the $\epsilon$-differential privacy of
$M$. Thus $E[D(A)]$ is close to $E[D(B)]$. We now show that $E[\frac{1}{D(A)}]$ is close to $\frac{1}{E[D(A)]}$ for each $A$, which will
complete the proof.

The first step is to establish that $D(A)$ is concentrated around its expectation. Since $D = C + \sum_{i=1}^{T} Y_i$
where the $Y_i$'s are i.i.d. random variables in $[0, 1]$, standard concentration bounds imply

$\Pr[D > E[D] + t] \le \exp(-2t^2/T)$; $\Pr[D < E[D] - t] \le \exp(-2t^2/T)$;

Since > ^, we can now estimate 



, 1 , exp(e') fi „ , 1 

^ exp(eO^ /■°"P(-^')^t^l Pr[£) < z\ 



E{D\ 
exp(e') 1 



ic 

.oxp(-e')E[£'] 

- ^i[Df^C^' exp(-2(z-E[i5])Vr)dz 



< 



exp(e') 1 



since E{D\ >C> v^lZpH, E[i?] < 2T, and C > 1. Thus E[i] < ^Sg^. 
Similarly, 

CXp(-€') 

exp(-e') f no] 1 
E — > , , ^ - / Pr — < y]dv 

^D^ - E[D] Jo ^ 

^ exp(~eO _ p Pv[D>z] ^_^ 

^ ^rfer - ^^TTT?^ r exp(-2(.-E[i.])VT)d. 
^ exp(-e') _ exp(~2£') ^/^ 



> 



E[i:'] E[i:)]2 

exp(-eO e' 

E[i:)] E[i:)] ' 



so that E[-i] > 



Thus $E[\frac{1}{D(A)}] \le \exp(7\epsilon')E[\frac{1}{D(B)}]$ for neighboring inputs $A$ and $B$. Now using this fact in expression (10.9)
for $\Pr[M'(A) = r_0]$ above, we conclude that $M'$ satisfies $(\epsilon + 8\epsilon')$-differential privacy. □
A Unweighted Vertex Cover Algorithm: An Alternate View

In this section, we consider a slightly different way to implement the vertex cover algorithm. Given a
graph $G = (V, E)$, we mimic the randomized proportional-to-degree algorithm for $\alpha n$ rounds ($\alpha < 1$), and
output the remaining vertices in random order. That is, in each of the first $\alpha n$ rounds, we select the next
vertex $i$ with probability proportional to $d(i) + 1/\epsilon$: this is equivalent to imagining that each vertex has $1/\epsilon$
"hallucinated" edges in addition to its real edges. (It is most convenient to imagine the other endpoint of
these hallucinated edges as being fake vertices which are always ignored by the algorithm.)

When we select a vertex, we remove it from the graph, together with the real and hallucinated edges
adjacent to it. This is equivalent to picking a random (real or hallucinated) edge from the graph, and
outputting a random real endpoint. Outputting a vertex affects the real edges in the remaining graph, but
does not change the hallucinated edges incident to other vertices.
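The two-phase procedure just described can be sketched as below. The function names are hypothetical, vertices are assumed to be numbered `0..n-1`, and the helper `cover_from_order` encodes the assumption that each edge is covered by whichever endpoint appears first in the output order.

```python
import random

def private_vertex_order(n, edges, eps, alpha, rng):
    """Phase 1: alpha*n degree-proportional picks with 1/eps hallucinated edges per vertex.
    Phase 2: the remaining vertices in uniformly random order."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining, order = list(range(n)), []
    for _ in range(int(alpha * n)):
        weights = [len(adj[v]) + 1.0 / eps for v in remaining]
        v = rng.choices(remaining, weights=weights, k=1)[0]
        for u in adj[v]:
            adj[u].discard(v)            # removing v deletes its real edges
        remaining.remove(v)
        order.append(v)
    rng.shuffle(remaining)
    return order + remaining

def cover_from_order(order, edges):
    """Each edge is covered by its endpoint that comes first in the order (assumption)."""
    pos = {v: i for i, v in enumerate(order)}
    return {u if pos[u] < pos[v] else v for u, v in edges}

path = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
order = private_vertex_order(6, path, eps=1.0, alpha=0.5, rng=random.Random(1))
cover = cover_from_order(order, path)
```

Smaller values of `eps` make the hallucinated edges dominate the weights, pushing the first phase toward a uniformly random order.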
Privacy Analysis. The privacy analysis is similar to that of Theorem 5.1: imagine the weights being
$w_i = 1/\epsilon$ for the first $\alpha n$ rounds and $w_i = \infty$ for the remaining rounds, which gives us $2\sum_{i=(1-\alpha)n}^{n} \frac{1}{i\,w_i} \le
\epsilon\left(\frac{2\alpha}{1-\alpha}\right)$-differential privacy.

Utility Analysis. To analyze the utility, we couple our algorithm with a run of the non-private algorithm
$\mathcal{A}$ that at each step picks an arbitrary edge of the graph and then picks a random endpoint: it is an easy
exercise that this is a 2-approximation algorithm.

We refer to vertices that have non-zero "real" degree at the time they are selected by our algorithm as
interesting vertices: the cost of our algorithm is simply the number of interesting vertices it selects in the
course of its run. Let $I_1$ denote the number of interesting vertices it selects during the first $\alpha n$ steps, and
$I_2$ denote the number of interesting vertices it selects during its remaining $(1 - \alpha)n$ steps, when it is simply
ordering vertices randomly. Clearly, the total cost is $I_1 + I_2$.

We may view the first phase of our algorithm as selecting an edge at random (from among both real and
hallucinated ones) and then outputting one of its endpoints at random. Now, for the rounds in which our
algorithm selects a real edge, we can couple this selection with one step of an imagined run of $\mathcal{A}$ (selecting
the same edge and endpoint). Note that this run of $\mathcal{A}$ maintains a vertex cover that is a subset of our vertex
cover, and that once our algorithm has completed a vertex cover, no interesting vertices remain. Therefore,
while our algorithm continues to incur cost, $\mathcal{A}$ has not yet found a vertex cover.

In the first phase of our algorithm, every interesting vertex our algorithm selects has at least one real
edge adjacent to it, as well as $1/\epsilon$ hallucinated edges. Conditioned on selecting an interesting vertex, our
algorithm had selected a real edge with probability at least $\epsilon' = 1/(1 + 1/\epsilon)$. Let $R$ denote the random
variable that represents the number of steps $\mathcal{A}$ is run for. $E[R] \le 2\mathrm{OPT}$ since $\mathcal{A}$ is a 2-approximation
algorithm. By linearity of expectation:

$2\mathrm{OPT} \ge E[R] \ge \epsilon' \cdot E[I_1]$ (A.10)

We now show that most of our algorithm's cost comes from the first phase, and hence that $I_2$ is not much
larger than $I_1$.

Lemma A.1.

$E[I_1] \ge \ln\left(\frac{1}{1-\alpha}\right) \cdot E[I_2]$

Proof. Consider each of the $\alpha n$ steps of the first phase of our algorithm. Let $n_i$ denote the number of
interesting vertices remaining at step $i$. Note that $\{n_i\}$ is a non-increasing sequence. At step $i$, there are
$n_i$ interesting vertices and $n - i + 1$ remaining vertices. Note that the probability of picking an interesting
vertex is strictly greater than $n_i/(n - i + 1)$ at each step. We may therefore bound the expected number of
interesting vertices picked in the first phase:

$E[I_1] > \sum_{i=1}^{\alpha n} \frac{E[n_i]}{n - i + 1} \ge E[n_{\alpha n}] \sum_{j=(1-\alpha)n}^{n} \frac{1}{j} \ge \ln\left(\frac{1}{1-\alpha}\right) \cdot E[n_{\alpha n}]$

Noting that $E[I_2] \le E[n_{\alpha n}]$ completes the proof. □
Combining the facts above, we get that

$\frac{E[\mathrm{cost}]}{\mathrm{OPT}} \le \frac{2}{\epsilon'}\left(1 + \frac{1}{\ln((1-\alpha)^{-1})}\right)$.

B Missing Proofs 

In this section, we prove Lemma 6.4. The lemma is a consequence of the following more general inequality.

Consider the following $n$ round probabilistic process. In each round, an adversary chooses a $p_i \in [0, 1]$
possibly based on the first $(i - 1)$ rounds and a coin is tossed with heads probability $p_i$. Let $Z_i$ be the
indicator for the event that no coin comes up heads in the first $i$ steps. Let $Y_j$ denote the random
variable $\sum_{i=j}^{n} p_i Z_i$ and let $Y = Y_1$.

Lemma B.1. Let $Y$ be defined as above. Then for any $q$, $\Pr[Y > q] \le \exp(-q)$.



Proof. We claim that for any $j$ and any $q$, $\Pr[Y_j > q] \le \exp(-q)$, which implies the lemma. The proof
is by reverse induction on $j$. For $j = n$, $Y_n$ is 0 if the $n$th coin or any coin before it comes up heads and
$p_n$ otherwise. Thus for $q \ge p_n$, the left hand side is zero. For $q \in [0, p_n)$, the left hand side is at most
$(1 - p_n) \le \exp(-p_n) \le \exp(-q)$. Finally, for $q < 0$ the right hand side exceeds 1.

Now suppose that for any adversary's strategy and for all $q$, $\Pr[Y_{j+1} > q] \le \exp(-q)$. We will show
the claim for $Y_j$. Once again, for $q < 0$, the claim is trivial. In round $j$, if the adversary chooses $p_j$,
there is a probability $p_j$ that the coin comes up heads so that $Y_j = 0$. Thus for any $q \ge 0$, $\Pr[Y_j > q] =
\Pr[p_j Z_j + Y_{j+1} > q] = (1 - p_j)\Pr[Y_{j+1} > q - p_j]$. Using the inequality $(1 - x) \le \exp(-x)$ and the inductive
hypothesis, the claim follows for $Y_j$. □
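For a non-adaptive adversary (a fixed sequence of heads probabilities, which is a special case of the process above), the tail of $Y$ can be computed exactly and compared with $\exp(-q)$. The sketch below is a sanity check only; the function name `tail_prob` and the test sequences are illustrative assumptions.

```python
import math

def tail_prob(ps, q):
    """Exact Pr[Y > q] when the heads probabilities ps are fixed in advance.
    If the first heads occurs in round t, then Y is the sum of the earlier p_i."""
    prob_no_heads = 1.0   # probability that no coin has come up heads so far
    prefix = 0.0          # sum of p_i over the rounds already survived
    total = 0.0
    for p in ps:
        if prefix > q:
            total += prob_no_heads * p    # first heads occurs in this round
        prob_no_heads *= 1 - p
        prefix += p
    if prefix > q:
        total += prob_no_heads            # no heads at all
    return total

half_half = tail_prob([0.5, 0.5], 0.4)
grid = [0.0, 0.1, 0.5, 1.0, 1.5, 2.0]
checks = [tail_prob([0.3, 0.5, 0.2, 0.9], q) <= math.exp(-q) for q in grid]
```

In the two-coin example, $Y > 0.4$ happens exactly when the first toss is tails, so the tail probability is 1/2, comfortably below $\exp(-0.4)$.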



To map the randomized algorithm to the setting of Lemma B.1, we consider running the randomized
weighted set cover algorithm as follows. When choosing a set $S$ in step $i$, the algorithm first tosses a
coin whose heads probability is $p_i(A)$ to decide whether to pick a set covering $I$ or not. Then it uses a
second source of randomness to determine the set $S$ itself, sampling from $\{S : I \in S\}$ or $\{S : I \notin S\}$ with the
appropriate conditional probabilities based on the outcome of the coin. Clearly this is a valid implementation
of the weighted set cover algorithm. Note that the probabilities $p_i(A)$ may depend on the actual sets chosen
in the first $(i - 1)$ steps if none of the first $(i - 1)$ coins come up heads. Since Lemma B.1 applies even when
the $p_i(A)$'s are chosen adversarially, Lemma 6.4 follows.



We also prove a more general version of Lemma B.1 that applies to non-Bernoulli distributions. This
lemma will be needed to prove the privacy of our algorithm for submodular minimization. We
now consider a different $n$ round probabilistic process. In each round, an adversary chooses a distribution
$\mathcal{D}_i$ over $[0, 1]$, possibly based on the first $(i - 1)$ rounds, and a sample $R_i$ is drawn from the distribution $\mathcal{D}_i$.
Let $Z_1 = 1$ and let $Z_{i+1} = Z_i - R_i Z_i$. Let $Y_j$ denote the random variable $\sum_{i=j}^{n} Z_i \mathbb{E}[R_i]$ and let $Y$ denote
$Y_1$.

Lemma B.2. Let $Y$ be defined as above. Then for any $q$, $\Pr[Y > q] \le e\exp(-q)$.

Proof. We prove a stronger claim: for every $j$ and every $q$, $\Pr[Y_j > qZ_j] \le e\exp(-q)$. The proof is by reverse
induction on $j$. For $j = n$, $Y_n = \mathbb{E}[R_n]Z_n \le Z_n$ since $\mathcal{D}_n$ is supported on $[0, 1]$ and hence has expectation
at most 1. Thus the claim is trivial for any $q \ge 1$. For $q < 1$, the right hand side is at least 1 and there is
nothing to prove. Suppose that for any $q$ and any strategy of the adversary, $\Pr[Y_{j+1} > qZ_{j+1}] \le e\exp(-q)$.
We show the claim for $Y_j$. Once again the case $q \le 1$ is trivial, so we assume $q > 1$. Let $\mu_j$ denote $\mathbb{E}[R_j]$.
Note that $Y_j = Z_j\mu_j + Y_{j+1}$. Moreover, $Z_{j+1} = (1 - R_j)Z_j$. Thus,

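$$\Pr[Y_j > qZ_j] = \mathbb{E}_{R_j \in \mathcal{D}_j}\big[\Pr[Y_{j+1} > qZ_j - \mu_j Z_j]\big] = \mathbb{E}_{R_j \in \mathcal{D}_j}\Big[\Pr\Big[Y_{j+1} > \frac{q - \mu_j}{1 - R_j}\, Z_{j+1}\Big]\Big] \le \mathbb{E}_{R_j \in \mathcal{D}_j}\Big[e\exp\Big(-\frac{q - \mu_j}{1 - R_j}\Big)\Big].$$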

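We show that for any distribution $\mathcal{D}$, the last term is bounded by $e\exp(-q)$, which will complete the proof.
Re-arranging, it suffices to show that for any distribution $\mathcal{D}$ on $[0, 1]$ with mean $\mu$,

$$\mathbb{E}_{R \in \mathcal{D}}\Big[\exp\Big(\frac{\mu - qR}{1 - R}\Big)\Big] \le 1.$$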
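To spell out the rearrangement: the target is $\mathbb{E}_{R \in \mathcal{D}}[e\exp(-(q - \mu)/(1 - R))] \le e\exp(-q)$, where $\mu$ denotes $\mathbb{E}[R]$. Dividing both sides by $e\exp(-q)$ moves $q$ into the exponent, and the exponent simplifies as follows.

```latex
\begin{align*}
q - \frac{q - \mu}{1 - R}
  &= \frac{q(1 - R) - (q - \mu)}{1 - R}
   = \frac{\mu - qR}{1 - R}, \\[4pt]
\mathbb{E}_{R \in \mathcal{D}}\!\left[ e \exp\!\left( -\frac{q - \mu}{1 - R} \right) \right] \le e \exp(-q)
  &\iff
\mathbb{E}_{R \in \mathcal{D}}\!\left[ \exp\!\left( \frac{\mu - qR}{1 - R} \right) \right] \le 1 .
\end{align*}
```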

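Since $\frac{\mu - qR}{1 - R}$ is positive when $R < \mu/q$ and negative otherwise, one can verify that for any $R$,
$\exp\big(\frac{\mu - qR}{1 - R}\big) \le \exp\big(\frac{\mu - qR}{1 - \mu/q}\big)$. Moreover, since $\exp(\cdot)$ is convex, the function lies below the chord and we can conclude that

$$\exp\Big(\frac{\mu - qR}{1 - \mu/q}\Big) \le \exp\Big(\frac{\mu}{1 - \mu/q}\Big) + R\Big(\exp\Big(\frac{\mu - q}{1 - \mu/q}\Big) - \exp\Big(\frac{\mu}{1 - \mu/q}\Big)\Big).$$

Thus it suffices to prove that

$$\exp\Big(\frac{\mu}{1 - \mu/q}\Big) + \mu\Big(\exp\Big(\frac{\mu - q}{1 - \mu/q}\Big) - \exp\Big(\frac{\mu}{1 - \mu/q}\Big)\Big) \le 1,$$

or equivalently

$$1 + \mu\Big(\exp\Big(-\frac{q}{1 - \mu/q}\Big) - 1\Big) \le \exp\Big(-\frac{\mu}{1 - \mu/q}\Big).$$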
This rearranges to

$$1 - \exp\Big(-\frac{\mu}{1 - \mu/q}\Big) \le \mu\Big(1 - \exp\Big(-\frac{q}{1 - \mu/q}\Big)\Big).$$



Consider the function $f(x) = 1 - \exp\big(-\frac{x}{1 - \mu/q}\big)$. $f$ is convex with $f(0) = 0$ and $f(1) \le f(q) = 1 - \exp\big(-\frac{q}{1 - \mu/q}\big)$.
Thus $f(\mu) \le \mu f(1) \le \mu f(q)$, for $q \ge 1$. The claim follows. □
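The process behind Lemma B.2 can also be explored by simulation. The sketch below is illustrative only: it assumes the process starts with weight 1 in the first round, and the adaptive adversary `uniform_adversary` (which narrows its uniform distribution after a large sample) is a made-up example rather than a worst case. The names `simulate_Y` and `empirical_tail` are hypothetical. A simulation of one adversary does not establish the lemma; it only lets one compare an empirical tail with $e\exp(-q)$.

```python
import math
import random

def simulate_Y(choose_dist, n, rng):
    """One run of the n-round process; returns Y = sum_i Z_i * E[R_i].

    choose_dist(history) plays the adversary: given the samples drawn so
    far, it returns (sampler, mean), where sampler(rng) draws R_i from a
    distribution on [0, 1] whose expectation is mean.
    """
    z = 1.0   # assumed starting weight
    y = 0.0
    history = []
    for _ in range(n):
        sampler, mean = choose_dist(history)
        y += z * mean          # add Z_i * E[R_i]
        r = sampler(rng)
        z -= r * z             # Z_{i+1} = Z_i - R_i Z_i
        history.append(r)
    return y

def uniform_adversary(history):
    """Illustrative adversary: narrower range right after a large sample."""
    hi = 0.1 if history and history[-1] > 0.15 else 0.2
    return (lambda rng: rng.uniform(0.0, hi)), hi / 2.0

def empirical_tail(q, trials=5000, n=50, seed=0):
    """Fraction of simulated runs with Y > q."""
    rng = random.Random(seed)
    hits = sum(simulate_Y(uniform_adversary, n, rng) > q for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    for q in [1.0, 1.5, 2.0]:
        print(q, empirical_tail(q), math.e * math.exp(-q))
```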